CHIMERIC PROTEINS COMPRISING BORRELIA POLYPEPTIDES; USES THEREFOR

Background of the Invention 

Lyme borreliosis is the most common tick-borne
infectious disease in North America, Europe, and
northern Asia. The causative bacterial agent of this
disease, Borrelia burgdorferi, was first isolated and
cultivated in 1982 (Burgdorfer, W.A. et al., Science
216:1317-1319 (1982); Steere, A.R. et al., N. Engl. J.
Med. 308:733-740 (1983)). With that discovery, a wide
array of clinical syndromes, described in both the
European and American literature since the early 20th
century, could be attributed to infection by B.
burgdorferi (Afzelius, A., Acta Derm. Venereol. 2:120-125
(1921); Bannwarth, A., Arch. Psychiatr. Nervenkrankh.
117:161-185 (1944); Garin, C. and A. Bujadoux, J. Med.
Lyon 71:765-767 (1922); Herxheimer, K. and K. Hartmann,
Arch. Dermatol. Syphilol. 61:57-76, 255-300 (1902)).
The immune response to B. burgdorferi is
characterized by an early, prominent, and persistent
humoral response to the endoflagellar protein, p41
(fla), and to a protein constituent of the protoplasmic
cylinder, p93 (Szczepanski, A., and J.L. Benach,
Microbiol. Rev. 55:21 (1991)). The p41 flagellin
antigen is an immunodominant protein; however, it shares
significant homology with flagellins of other
microorganisms and therefore is highly cross-reactive.
The p93 antigen is the largest immunodominant antigen of
B. burgdorferi. Both the p41 and p93 proteins are
physically cryptic antigens, sheathed from the immune
system by an outer membrane whose major protein
constituents are the outer surface proteins A and B
(OspA and OspB). OspA is a basic lipoprotein of
approximately 31 kd, which is encoded on a large linear
plasmid along with OspB, a basic lipoprotein of
approximately 34 kd (Szczepanski, A., and J.L. Benach,
Microbiol. Rev. 55:21 (1991)). Analysis of isolates of
B. burgdorferi obtained from North America and Europe
has demonstrated that OspA has antigenic variability,
and that several distinct groups can be serologically
and genotypically defined (Wilske, B., et al., World J.
Microbiol. 7:130 (1991)). Other Borrelia proteins
demonstrate similar antigenic variability.
Surprisingly, the immune response to these outer surface
proteins tends to occur late in the disease, if at all
(Craft, J.E. et al., J. Clin. Invest. 78:934-939
(1986); Dattwyler, R.J. and B.J. Luft, Rheum. Clin.
North Am. 15:727-734 (1989)). Furthermore, patients
acutely and chronically infected with B. burgdorferi
respond variably to the different antigens, including
OspA, OspB, OspC, OspD, p39, p41 and p93.
Vaccines against Lyme borreliosis have been
attempted. Mice immunized with a recombinant form of
OspA are protected from challenge with the same strain
of B. burgdorferi from which the protein was obtained
(Fikrig, E., et al., Science 250:553-556 (1990)).
Furthermore, passively transferred anti-OspA monoclonal
antibodies (MAbs) have been shown to be protective in
mice, and vaccination with a recombinant protein induced
protective immunity against subsequent infection with
the homologous strain of B. burgdorferi (Simon, M.M., et
al., J. Infect. Dis. 164:123 (1991)). Unfortunately,
immunization with a protein from one strain does not
necessarily confer resistance to a heterologous strain
(Fikrig, E. et al., J. Immunol. 7:2256-1160 (1992)),
but rather, is limited to the homologous 'species' from
which the protein was prepared. Furthermore,
immunization with a single protein from a particular
strain of Borrelia will not confer resistance to that
strain in all individuals. There is considerable
variation displayed in OspA and OspB, as well as p93,
including the regions conferring antigenicity.
Therefore, the degree and frequency of protection from
vaccination with a protein from a single strain depend
upon the response of the immune system to the particular
variation, as well as the frequency of genetic variation
in B. burgdorferi. Currently, a need exists for a
vaccine which provides immunogenicity across species and
to more epitopes within a species, as well as
immunogenicity against more than one protein.

Summary of the Invention 
The current invention pertains to chimeric Borrelia
proteins which include two or more antigenic Borrelia
polypeptides which do not occur naturally (in nature) in
the same protein in Borrelia, as well as the nucleic
acids encoding such chimeric proteins. The antigenic
polypeptides incorporated in the chimeric proteins are
derived from any Borrelia protein from any strain of
Borrelia, and include outer surface protein (Osp) A,
OspB, OspC, OspD, p12, p39, p41, p66, and p93. The
proteins from which the antigenic polypeptides are
derived can be from the same strain of Borrelia, from
different strains, or from combinations of proteins from
the same and from different strains. If the proteins
from which the antigenic polypeptides are derived are
OspA or OspB, the antigenic polypeptides can be derived
from either the portion of the OspA or OspB protein
present between the amino terminus and the conserved
tryptophan of the protein (referred to as a proximal
portion), or the portion of the OspA or OspB protein
present between the conserved tryptophan of the protein
and the carboxy terminus (referred to as a distal
portion). Particular chimeric proteins, and the
nucleotide sequences encoding them, are set forth in
Figures 23-37 and 43-46.
The chimeric proteins of the current invention
provide antigenic polypeptides of a variety of Borrelia
strains and/or proteins within a single protein. Such
proteins are particularly useful in immunodiagnostic
assays to detect the presence of antibodies to native
Borrelia in potentially infected individuals as well as
to measure T-cell reactivity, and can therefore be used
as immunodiagnostic reagents. The chimeric proteins of
the current invention are additionally useful as vaccine
immunogens against Borrelia infection.

For a better understanding of the present invention
together with other and further objects, reference is
made to the following description, taken together with
the accompanying drawings.

Brief Description of the Drawings 
Figure 1 summarizes peptides and antigenic domains
localized by proteolytic and chemical fragmentation of
OspA.

Figure 2 is a comparison of the antigenic domains
depicted in Figure 1, for OspA in nine strains of B.
burgdorferi.

Figure 3 is a graph depicting a plot of weighted
polymorphism versus amino acid position among 14 OspA
variants. The marked peaks are: a) amino acids 132-145;
b) amino acids 163-177; c) amino acids 208-221. The
lower dotted line at polymorphism value 1.395 demarcates
statistically significant excesses of polymorphism at p
= 0.05. The upper dotted line at 1.520 is the same,
except that the first 29 amino acids at the monomorphic
N-terminus have been removed from the original analysis.

Figure 4 depicts the amino acid alignment of
residues 200 through 220 for OspAs from strains B31 and
K48 as well as for the site-directed mutants 613, 625,
640, 613/625, and 613/640. Arrow indicates Trp216.
Amino acid changes are underlined.

Figure 5 is a helical wheel projection of residues
204-217 of B31 OspA. Capital letters indicate
hydrophobic residues; lower case letters indicate
hydrophilic residues; +/- indicate positively/negatively
charged residues. Dashed line indicates division of the
alpha-helix into hydrophobic arc (above the line) and
polar arc (below the line). Adapted from France et al.
(Biochim. Biophys. Acta 1120:59 (1992)).

Figure 6 depicts a phylogenic tree for strains of
Borrelia described in Table I. The strains are as
follows: 1 = B31; 2 = PKal; 3 = ZS7; 4 = N40; 5 =
25015; 6 = K48; 7 = DK29; 8 = PHei; 9 = Ip90; 10 =
PTrob; 11 = ACAI; 12 = PGau; 13 = Ip3; 14 = PBo; 15 =
PKo.

Figure 7 depicts the nucleic acid sequence of
OspA-B31 (SEQ ID NO. 6), and the encoded protein sequence
(SEQ ID NO. 7).

Figure 8 depicts the nucleic acid sequence of
OspA-K48 (SEQ ID NO. 8), and the encoded protein sequence
(SEQ ID NO. 9).

Figure 9 depicts the nucleic acid sequence of
OspA-PGau (SEQ ID NO. 10), and the encoded protein sequence
(SEQ ID NO. 11).

Figure 10 depicts the nucleic acid sequence of
OspA-25015 (SEQ ID NO. 12), and the encoded protein
sequence (SEQ ID NO. 13).

Figure 11 depicts the nucleic acid sequence of
OspB-B31 (SEQ ID NO. 21), and the encoded protein
sequence (SEQ ID NO. 22).

Figure 12 depicts the nucleic acid sequence of
OspC-B31 (SEQ ID NO. 29), and the encoded protein
sequence (SEQ ID NO. 30).

Figure 13 depicts the nucleic acid sequence of
OspC-K48 (SEQ ID NO. 31), and the encoded protein
sequence (SEQ ID NO. 32).

Figure 14 depicts the nucleic acid sequence of
OspC-PKo (SEQ ID NO. 33), and the encoded protein
sequence (SEQ ID NO. 34).

Figure 15 depicts the nucleic acid sequence of
OspC-pTrob (SEQ ID NO. 35) and the encoded protein
sequence (SEQ ID NO. 36).

Figure 16 depicts the nucleic acid sequence of
p93-B31 (SEQ ID NO. 65) and the encoded protein sequence
(SEQ ID NO. 66).

Figure 17 depicts the nucleic acid sequence of
p93-K48 (SEQ ID NO. 67).

Figure 18 depicts the nucleic acid sequence of
p93-PBo (SEQ ID NO. 69).

Figure 19 depicts the nucleic acid sequence of
p93-pTrob (SEQ ID NO. 71).

Figure 20 depicts the nucleic acid sequence of
p93-pGau (SEQ ID NO. 73).

Figure 21 depicts the nucleic acid sequence of
p93-25015 (SEQ ID NO. 75).

Figure 22 depicts the nucleic acid sequence of
p93-pKo (SEQ ID NO. 77).

Figure 23 depicts the nucleic acid sequence of the
OspA-K48/OspA-PGau chimer (SEQ ID NO. 85) and the
encoded chimeric protein sequence (SEQ ID NO. 86).

Figure 24 depicts the nucleic acid sequence of the
OspA-B31/OspA-PGau chimer (SEQ ID NO. 88) and the
encoded chimeric protein sequence (SEQ ID NO. 89).

Figure 25 depicts the nucleic acid sequence of the
OspA-B31/OspA-K48 chimer (SEQ ID NO. 91) and the encoded
chimeric protein sequence (SEQ ID NO. 92).

Figure 26 depicts the nucleic acid sequence of the
OspA-B31/OspA-25015 chimer (SEQ ID NO. 94) and the
encoded chimeric protein sequence (SEQ ID NO. 95).

Figure 27 depicts the nucleic acid sequence of the
OspA-K48/OspA-B31/OspA-K48 chimer (SEQ ID NO. 97) and
the encoded chimeric protein sequence (SEQ ID NO. 98).

Figure 28 depicts the nucleic acid sequence of the
OspA-B31/OspA-K48/OspA-B31/OspA-K48 chimer (SEQ ID NO.
100) and the encoded chimeric protein sequence (SEQ ID
NO. 101).

Figure 29 depicts the nucleic acid sequence of the
OspA-B31/OspB-B31 chimer (SEQ ID NO. 103) and the
encoded chimeric protein sequence (SEQ ID NO. 104).

Figure 30 depicts the nucleic acid sequence of the
OspA-B31/OspB-B31/OspC-B31 chimer (SEQ ID NO. 106) and
the encoded chimeric protein sequence (SEQ ID NO. 107).

Figure 31 depicts the nucleic acid sequence of the
OspC-B31/OspA-B31/OspB-B31 chimer (SEQ ID NO. 109) and
the encoded chimeric protein sequence (SEQ ID NO. 110).

Figure 32 depicts the nucleic acid sequence of the
OspA-B31/p93-B31 chimer (SEQ ID NO. 111) and the encoded
chimeric protein sequence (SEQ ID NO. 112).

Figure 33 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (122-234) chimer (SEQ ID NO. 113) and
the encoded chimeric protein sequence (SEQ ID NO. 114).

Figure 34 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (122-295) chimer (SEQ ID NO. 115) and
the encoded chimeric protein sequence (SEQ ID NO. 116).

Figure 35 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (140-234) chimer (SEQ ID NO. 117) and
the encoded chimeric protein sequence (SEQ ID NO. 118).

Figure 36 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (140-295) chimer (SEQ ID NO. 119) and
the encoded chimeric protein sequence (SEQ ID NO. 120).

Figure 37 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (122-234)/OspC-B31 chimer (SEQ ID NO.
121) and the encoded chimeric protein sequence (SEQ ID
NO. 122).

Figure 38 depicts an alignment of the nucleic acid
sequences for OspC-B31 (SEQ ID NO. 29), OspC-PKo (SEQ ID
NO. 33), OspC-pTrob (SEQ ID NO. 35), and OspC-K48 (SEQ
ID NO. 31). Nucleic acids which are identical to those
in the lead nucleic acid sequence (here, OspC-B31) are
represented by a period (.); differing nucleic acids are
shown in lower case letters.

Figure 39 depicts an alignment of the nucleic acid
sequences for OspD-pBo (SEQ ID NO. 123), OspD-PGau (SEQ
ID NO. 124), OspD-DK29 (SEQ ID NO. 125), and OspD-K48
(SEQ ID NO. 126). Nucleic acids which are identical to
those in the lead nucleic acid sequence (here, OspD-pBo)
are represented by a period (.); differing nucleic acids
are shown in lower case letters.

Figure 40 depicts the nucleic acid sequence of
p41-B31 (SEQ ID NO. 127) and the encoded protein sequence
(SEQ ID NO. 128).

Figure 41 depicts an alignment of the nucleic acid
sequences for p41-B31 (SEQ ID NO. 127), p41-pKal (SEQ ID
NO. 129), p41-PGau (SEQ ID NO. 51), p41-PBo (SEQ ID NO.
130), p41-DK29 (SEQ ID NO. 53), and p41-PKo (SEQ ID NO.
131). Nucleic acids which are identical to those in the
lead nucleic acid sequence (here, p41-B31) are
represented by a period (.); differing nucleic acids are
shown in lower case letters.

Figure 42 depicts an alignment of the nucleic acid
sequences for OspA-B31 (SEQ ID NO. 6), OspA-pKal (SEQ ID
NO. 132), OspA-N40 (SEQ ID NO. 133), OspA-ZS7 (SEQ ID
NO. 134), OspA-25015 (SEQ ID NO. 12), OspA-pTrob (SEQ ID
NO. 135), OspA-K48 (SEQ ID NO. 8), OspA-Hei (SEQ ID NO.
136), OspA-DK29 (SEQ ID NO. 49), OspA-Ip90 (SEQ ID NO.
50), OspA-pBo (SEQ ID NO. 55), OspA-Ip3 (SEQ ID NO. 56),
OspA-PKo (SEQ ID NO. 57), OspA-ACAI (SEQ ID NO. 58), and
OspA-PGau (SEQ ID NO. 10). Nucleic acids which are
identical to those in the lead nucleic acid sequence
(here, OspA-B31) are represented by a period (.);
differing nucleic acids are shown in lower case letters.

Figure 43 depicts the nucleic acid sequence of the
OspA-Tro/OspA-Bo chimer (SEQ ID NO. 137) and the encoded
chimeric protein sequence (SEQ ID NO. 138).

Figure 44 depicts the nucleic acid sequence of the
OspA-PGau/OspA-Bo chimer (SEQ ID NO. 139) and the
encoded chimeric protein sequence (SEQ ID NO. 140).

Figure 45 depicts the nucleic acid sequence of the
OspA-B31/OspA-PGau/OspA-B31/OspA-K48 chimer (SEQ ID NO.
141) and the encoded chimeric protein sequence (SEQ ID
NO. 142).

Figure 46 depicts the nucleic acid sequence of the
OspA-PGau/OspA-B31/OspA-K48 chimer (SEQ ID NO. 143) and
the encoded chimeric protein sequence (SEQ ID NO. 144).
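
The notation used for the alignments in Figures 38, 39, 41 and 42 (a period where a nucleotide matches the lead sequence, a lower case letter where it differs) can be illustrated with a short Python sketch. The function name and the two short sequences are hypothetical and are not taken from the sequence listing; the sketch assumes the sequences have already been aligned to equal length.

```python
def dot_alignment(lead: str, other: str) -> str:
    """Render `other` against `lead`: '.' for identity, lower case for a difference."""
    if len(lead) != len(other):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "." if a.upper() == b.upper() else b.lower()
        for a, b in zip(lead, other)
    )


# Hypothetical, already-aligned toy sequences (not from any SEQ ID NO.).
lead = "ATGAAAAAATAT"
variant = "ATGAGAAAGTAT"

print(lead)
print(dot_alignment(lead, variant))
```

Only the positions at which a strain departs from the lead sequence remain visible, which is what makes the polymorphic stretches easy to pick out by eye.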

Detailed Description of the Invention 

The current invention pertains to chimeric proteins
comprising antigenic Borrelia polypeptides which do not
occur in nature in the same Borrelia protein. The
chimeric proteins are a combination of two or more
antigenic polypeptides derived from Borrelia proteins.
The antigenic polypeptides can be derived from different
proteins from the same species of Borrelia, or different
proteins from different Borrelia species, as well as
from corresponding proteins from different species. As
used herein, the term "chimeric protein" describes a
protein comprising two or more polypeptides which are
derived from corresponding and/or non-corresponding
native Borrelia proteins. A polypeptide "derived from" a
native Borrelia protein is a polypeptide which has an
amino acid sequence the same as an amino acid sequence
present in a Borrelia protein, an amino acid sequence
equivalent to the amino acid sequence of a naturally
occurring Borrelia protein, or an amino acid sequence
substantially similar to the amino acid sequence of a
naturally occurring Borrelia protein (e.g., differing by
a few amino acids), such as when a nucleic acid encoding a
protein is subjected to site-directed mutagenesis.
"Corresponding" proteins are equivalent proteins from
different species or strains of Borrelia, such as outer
surface protein A (OspA) from strain B31 and OspA from
strain K48. The invention additionally pertains to
nucleic acids encoding these chimeric proteins.

As described below, Applicants have identified two
separate antigenic domains of OspA and OspB which flank
the sole conserved tryptophan present in OspA and in
OspB. These domains share cross-reactivity with
different genospecies of Borrelia. The precise amino
acids responsible for antigenic variability were
determined through site-directed mutagenesis, so that
proteins with specific amino acid substitutions are
available for the development of chimeric proteins.
Furthermore, Applicants have identified immunologically
important hypervariable domains in OspA proteins, as
described below in Example 2. The first hypervariable
domain of interest for chimeric proteins, Domain A,
includes amino acid residues 120-140 of OspA; the second
hypervariable domain, Domain B, includes residues 150-180;
and the third hypervariable domain, Domain C,
includes residues 200-216 or 217 (depending on the
position of the sole conserved tryptophan residue in the
OspA of that particular species of Borrelia) (see Figure
3). In addition, Applicants have sequenced the genes
for several Borrelia proteins.
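
As an illustration of the residue ranges given above for Domains A, B and C, the following Python sketch cuts the three domains out of an OspA amino acid sequence. It is only a sketch under stated assumptions: the function names are hypothetical, residue numbering is taken to be 1-based and inclusive, and the test sequence is an artificial placeholder rather than a real OspA sequence.

```python
# Fixed ranges stated in the text (1-based, inclusive).
FIXED_DOMAINS = {"A": (120, 140), "B": (150, 180)}


def domain_c_range(seq: str, start: int = 200, candidates=(216, 217)):
    """Domain C runs from residue 200 to the conserved Trp at 216 or 217."""
    for pos in candidates:
        if len(seq) >= pos and seq[pos - 1] == "W":
            return (start, pos)
    raise ValueError("no tryptophan found at residue 216 or 217")


def extract_domains(seq: str) -> dict:
    """Return the three hypervariable domains as substrings of `seq`."""
    ranges = dict(FIXED_DOMAINS)
    ranges["C"] = domain_c_range(seq)
    # Convert 1-based inclusive coordinates to a Python slice.
    return {name: seq[s - 1:e] for name, (s, e) in ranges.items()}


# Artificial placeholder: 273 residues with a single Trp at position 216.
toy_ospa = "A" * 215 + "W" + "A" * 57
domains = extract_domains(toy_ospa)
print({name: len(sub) for name, sub in domains.items()})
```

Checking for the tryptophan at position 216 before 217 is an arbitrary choice made for the sketch; the text states only that the position depends on the species.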

These discoveries have aided in the development of
novel recombinant Borrelia proteins which include two or
more amino acid regions or sequences which do not occur
in the same Borrelia protein in nature. The recombinant
proteins comprise polypeptides from a variety of
Borrelia proteins, including, but not limited to, OspA,
OspB, OspC, OspD, p12, p39, p41, p66, and p93.
Antigenically relevant polypeptides from each of a
number of proteins are combined into a single chimeric
protein.

In one embodiment of the current invention, chimers
are now available which include antigenic polypeptides
flanking a tryptophan residue. The antigenic
polypeptides are derived from either the proximal
portion from the tryptophan (the portion of the OspA or
OspB protein present between the amino terminus and the
conserved tryptophan of the protein), or the distal
portion from the tryptophan (the portion of the OspA or
OspB protein present between the conserved tryptophan of
the protein and the carboxy terminus) in OspA and/or
OspB. The resultant chimers can be OspA-OspA chimers
(i.e., chimers incorporating polypeptides derived from
OspA from different strains of Borrelia), OspA-OspB
chimers, or OspB-OspB chimers, and are constructed such
that amino acid residues amino-proximal to an invariant
tryptophan are from one protein and residues
carboxy-proximal to the invariant tryptophan are from the other
protein. For example, one available chimer consists of
a polypeptide derived from the amino-proximal region of
OspA from strain B31, followed by the tryptophan
residue, followed by a polypeptide derived from the
carboxy-proximal region of OspA from strain K48 (SEQ ID
NO. 92). Another available chimer includes a
polypeptide derived from the amino-proximal region of
OspA from strain B31, and a polypeptide derived from the
carboxy-proximal region of OspB from strain B31 (SEQ ID
NO. 104). If the polypeptide proximal to the tryptophan
of these chimeric proteins is derived from OspA, the
proximal polypeptide can be further subdivided into the
three hypervariable domains (Domains A, B, and C), each
of which can be derived from OspA from a different
strain of Borrelia. These chimeric proteins can further
comprise antigenic polypeptides from another protein, in
addition to the antigenic polypeptides flanking the
tryptophan residue.
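
The arrangement described above (amino-proximal residues from one protein, the invariant tryptophan, then carboxy-proximal residues from the other protein) can be sketched at the sequence level in Python. This is an illustration of the sequence layout only, not of the cloning procedure; the function names and the short donor sequences are hypothetical, and each donor is assumed to contain exactly one tryptophan, as the text states for OspA and OspB.

```python
def split_at_tryptophan(seq: str):
    """Split a sequence into the parts before and after its single Trp (W)."""
    if seq.count("W") != 1:
        raise ValueError("expected exactly one tryptophan in the sequence")
    i = seq.index("W")
    return seq[:i], seq[i + 1:]


def make_chimer(amino_donor: str, carboxy_donor: str) -> str:
    """Join the amino-proximal part of one protein to the
    carboxy-proximal part of another, around the shared tryptophan."""
    proximal, _ = split_at_tryptophan(amino_donor)
    _, distal = split_at_tryptophan(carboxy_donor)
    return proximal + "W" + distal


# Hypothetical toy sequences standing in for two donor proteins.
donor_1 = "MKAAAWGGG"
donor_2 = "MKTTTTWSSSS"
print(make_chimer(donor_1, donor_2))
```

Swapping the argument order gives the reciprocal chimer, and the proximal part could itself be assembled from Domains A, B and C of different strains before joining.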

In another embodiment of the current invention,
chimeric proteins are available which incorporate
antigenic domains of two or more Borrelia proteins, such
as Osp proteins (Osp A, B, C and/or D) as well as p12,
p39, p41, p66, and/or p93.

The chimers described herein can be produced so
that they are highly soluble, hyper-produced in E. coli,
and non-lipidated. In addition, the chimeric proteins
can be designed to end in an affinity tag (His-tag) to
facilitate purification. The recombinant proteins
described herein have been constructed to maintain high
levels of antigenicity. In addition, recombinant
proteins specific for the various genospecies of
Borrelia that cause Lyme disease are now available,
because the genes from each of the major genospecies
have been sequenced; the sequences are set forth below.
These recombinant proteins with their novel biophysical
and antigenic properties will be important diagnostic
reagents and vaccine candidates.

The chimeric proteins of the current invention are
advantageous in that they retain specific reactivity to
monoclonal and polyclonal antibodies against wild-type
Borrelia proteins, are immunogenic, and inhibit the
growth or induce lysis of Borrelia in vitro.
Furthermore, in some embodiments, the proteins provide
antigenic domains of two or more Borrelia strains and/or
proteins within a single protein. Such proteins are
particularly useful in immunodiagnostic assays. For
example, proteins of the present invention can be used
as reagents in assays to detect the presence of
antibodies to native Borrelia in potentially infected
individuals. These proteins can also be used as
immunodiagnostic reagents, such as in dot blots, Western
blots, enzyme-linked immunosorbent assays, or
agglutination assays. The chimeric proteins of the
present invention can be produced by known techniques,
such as by recombinant methodology, polymerase chain
reaction, or mutagenesis.

Furthermore, the proteins of the current invention
are useful as vaccine immunogens against Borrelia
infection. Because Borrelia has been shown to be
clonal, a protein comprising antigenic polypeptides from
a variety of Borrelia proteins and/or species will
provide immunoprotection for a considerable time when
used in a vaccine. The lack of significant intragenic
recombination, a process which might rapidly generate
novel epitopes with changed antigenic properties,
ensures that Borrelia can only change antigenic type by
accumulating mutational change, which is slow when
compared with recombination in generating different
antigenic types. The chimeric protein can be combined
with a physiologically acceptable carrier and
administered to a vertebrate animal through standard
methods (e.g., intravenously or intramuscularly).

The current invention is illustrated by the
following Examples, which are not to be construed to be
limiting in any way.



Example 1. Purification of Borrelia burgdorferi Outer
Surface Protein A and Analysis of
Antibody Binding Domains

This example details a method for the purification
of large amounts of native outer surface protein A
(OspA) to homogeneity, and describes mapping of the
antigenic specificities of several anti-OspA MAbs. OspA
was purified to homogeneity by exploiting its resistance
to trypsin digestion. Intrinsic labeling with
14C-palmitic acid confirmed that OspA was lipidated, and
partial digestion established lipidation at the
amino-terminal cysteine of the molecule.

The reactivity of seven anti-OspA murine monoclonal
antibodies to nine different Borrelia isolates was
ascertained by Western blot analysis. Purified OspA was
fragmented by enzymatic or chemical cleavage, and the
monoclonal antibodies were able to define four distinct
immunogenic domains (see Figure 1). Domain 3, which
included residues 190-220 of OspA, was reactive with
protective antibodies known to agglutinate the organism
in vitro, and included distinct specificities, some of
which were not restricted to a genotype of B.
burgdorferi.

A. Purification of Native OspA

Detergent solubilization of B. burgdorferi strips
the outer surface proteins and yields partially-purified
preparations containing both OspA and outer surface
protein B (OspB) (Barbour, A.G. et al., Infect. Immun.
52(5):549-554 (1986); Coleman, J.L. and J.L. Benach, J.
Infect. Dis. 155(4):756-765 (1987); Cunningham, T.M.
et al., Ann. NY Acad. Sci. 539:376-378 (1988); Brandt,
M.E. et al., Infect. Immun. 58:983-991 (1990); Sambri,
V. and R. Cevenini, Microbiol. 14:307-314 (1991)).
Although both OspA and OspB are sensitive to proteinase
K digestion, in contrast to OspB, OspA is resistant to
cleavage by trypsin (Dunn, J. et al., Prot. Exp. Purif.
1:159-168 (1990); Barbour, A.G. et al., Infect. Immun.
45:94-100 (1984)). The relative insensitivity to
trypsin is surprising in view of the fact that OspA has
a high (16% for B31) lysine content, and may relate to
the relative configuration of OspA and OspB in the outer
membrane.

Intrinsic Radiolabeling of Borrelia

Labeling for lipoproteins was performed as
described by Brandt et al. (Infect. Immun. 58:983-991
(1990)). 14C-palmitic acid (ICN, Irvine, California) was
added to the BSK II media to a final concentration of
0.5 µCi per milliliter (ml). Organisms were cultured at
34°C in this medium until a density of 10* cells per ml
was achieved.

Purification of OspA Protein from Borrelia Strain B31 
Borrelia burgdorferi, either 14C-palmitic acid-labeled
or unlabeled, were harvested and washed as
described (Brandt, M.E. et al., Infect. Immun.
58:983-991 (1990)). Whole organisms were trypsinized according
to the protocol of Barbour et al. (Infect. Immun.
45:94-100 (1984)) with some modifications. The pellet was
suspended in phosphate buffered saline (PBS, 10 mM, pH
7.2), containing 0.8% tosyl-L-phenylalanine chloromethyl
ketone (TPCK)-treated trypsin (Sigma, St. Louis,
Missouri), the latter at a ratio of 1 µg per 10^8 cells.
Reaction was carried out at 25°C for 1 hour, following
which the cells were centrifuged. The pellet was washed
in PBS with 100 µg/ml phenylmethylsulfonyl fluoride
(PMSF). Triton X-114 partitioning of the pellet was
carried out as described by Brandt et al. (Infect.
Immun. 58:983-991 (1990)). Following trypsin treatment,
cells were resuspended in ice-cold 2% (v/v) Triton X-114
in PBS at 10^9 cells per ml. The suspension was rotated
overnight at 4°C, and the insoluble fraction removed as
a pellet after centrifugation at 10,000 x g for 15
minutes at 4°C. The supernatant (soluble fraction) was
incubated at 37°C for 15 minutes and centrifuged at room
temperature at 1000 x g for 15 minutes to separate the
aqueous and detergent phases. The aqueous phase was
decanted, and ice-cold PBS added to the lower Triton
phase, mixed, warmed to 37°C, and again centrifuged at
1000 x g for 15 minutes. Washing was repeated twice
more. Finally, detergent was removed from the
preparation using a spin column of Bio-beads SM2
(BioRad, Melville, New York) as described (Holloway,
P.W., Anal. Biochem. 53:304-308 (1973)).
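
The per-cell ratios in the protocol above scale directly with the size of the harvest. The following Python sketch works through that arithmetic, assuming the ratios are 1 µg of trypsin per 10^8 cells and 10^9 cells per ml for the Triton X-114 resuspension. The function names and the example cell count are hypothetical and serve only to illustrate the calculation.

```python
CELLS_PER_UG_TRYPSIN = 1e8   # 1 ug TPCK-treated trypsin per 10^8 cells
CELLS_PER_ML_TRITON = 1e9    # resuspend at 10^9 cells per ml of 2% Triton X-114


def trypsin_ug(total_cells: float) -> float:
    """Mass of trypsin (in micrograms) for a given number of cells."""
    return total_cells / CELLS_PER_UG_TRYPSIN


def triton_volume_ml(total_cells: float) -> float:
    """Volume of 2% Triton X-114 in PBS (in ml) for the resuspension step."""
    return total_cells / CELLS_PER_ML_TRITON


# Hypothetical harvest of 5 x 10^10 cells.
cells = 5e10
print(f"trypsin: {trypsin_ug(cells):.0f} ug")
print(f"Triton X-114 solution: {triton_volume_ml(cells):.0f} ml")
```

For this hypothetical harvest the calculation gives 500 µg of trypsin and 50 ml of Triton X-114 solution.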

Ion exchange chromatography was carried out as
described by Dunn et al. (Prot. Exp. Purif. 1:159-168
(1990)) with minor modifications. Crude OspA was
dissolved in buffer A (1% Triton X-100, 10 mM phosphate
buffer (pH 5.0)) and loaded onto a SP Sepharose resin
(Pharmacia, Piscataway, New Jersey), pre-equilibrated
with buffer A at 25°C. After washing the column with 10
bed-volumes of buffer A, the bound OspA was eluted with
buffer B (1% Triton X-100, 10 mM phosphate buffer (pH
8.0)). OspA fractions were detected by protein assay
using the BCA method (Pierce, Rockford, Illinois), or as
radioactivity when intrinsically labeled material was
fractionated. Triton X-100 was removed using a spin
column of Bio-beads SM2.

This method purifies OspA from an outer surface
membrane preparation. In the absence of trypsin
treatment, OspA and OspB were the major components of the
soluble fraction obtained after Triton partitioning of
strain B31. In contrast, when Triton extraction was
carried out after trypsin treatment, the OspB band was
not seen. Further purification of OspA-B31 on a SP
Sepharose column resulted in a single band by SDS-PAGE.
The yield following removal of detergent was
approximately 2 mg per liter of culture. This method of
purification of OspA, as described herein for strain
B31, can be used for other isolates of Borrelia as well.
For strains such as strain K48, which lack OspB, trypsin
treatment can be omitted.

Lipidation site of OspA-B31 

14C-palmitic acid-labeled OspA from strain B31 was
purified as described above and partially digested with
endoproteinase Asp-N (data not shown). Following
digestion, a new band of lower molecular weight was
apparent by SDS-PAGE, found by direct amino-terminal
sequencing to begin at Asp^. This band had no trace of
radioactivity by autoradiography (data not shown). OspA
and OspB contain a signal sequence (L-X-Y-C) similar to the
consensus described for lipoproteins of E. coli, and it
has been predicted that the lipidation site of OspA and
OspB should be the amino-terminal cysteine (Brandt, M.E. et
al., Infect. Immun. 58:983-991 (1990)). The results
presented herein support this prediction.

B. Comparison of OspA Antibody Binding Regions in Nine
Strains of Borrelia burgdorferi

The availability of the amino acid sequences for
OspA from a number of different isolates, combined with
peptide mapping and Western blot analysis, permitted the
identification of the antigenic domains recognized by
monoclonal antibodies (MAbs) and allowed inference of
the key amino acid residues responsible for specific
antibody reactivity.

Strains of Borrelia burgdorferi 

Nine strains of Borrelia, including seven European
strains and two North American strains, were used in
this study of antibody binding domains of several
proteins. Information concerning the strains is
summarized in Table I, below.

Table I. Representative Borrelia Strains

Strain   Location and Source                   Reference for Strain
K48      Czechoslovakia, Ixodes ricinus        none
PGau     Germany, human ACA                    Wilske, B. et al., J. Clin. Microbiol. 32:340-350 (1993)
DK29     Denmark, human EM                     Wilske, B. et al.
PKo      Germany, human EM                     Wilske, B. et al.
PTrob    Germany, human skin                   Wilske, B. et al.
Ip3      Khabarovsk, Russia, I. persulcatus    Asbrink, E. et al., Acta Derm. Venereol. 64:506-512 (1984)
Ip90     Khabarovsk, Russia, I. persulcatus    Asbrink, E. et al.
25015    Millbrook, NY, I. persulcatus         Barbour, A.G. et al., Curr. Microbiol. 8:123-126 (1983)
B31      Shelter Island, NY, I. scapularis     Luft, B.J. et al., Infect. Immun. 60:4309-4321 (1992); ATCC 35210
PKal     Germany, human CSF                    Wilske, B. et al.
ZS7      Freiburg, Germany, I. ricinus         Wallich, R. et al., Nucl. Acids Res. 17:8864 (1989)
N40      Westchester Co., NY                   Fikrig, E. et al., Science 250:553-556 (1990)
PHei     Germany, human CSF                    Wilske, B. et al.
ACAI     Sweden, human ACA                     Luft, B.J. et al., FEMS Microbiol. Lett. 93:73-68 (1992)
PBo      Germany, human CSF                    Wilske, B. et al.

ACA = patient with acrodermatitis chronica atrophicans;
EM = patient with erythema migrans; CSF = cerebrospinal
fluid of patient with Lyme disease

Strains K48, PGau and DK29 were supplied by R.
Johnson, University of Minnesota; PKo and PTrob were
provided by B. Wilske and V. Preac-Mursic of the
Pettenkofer Institute, Munich, Germany; and Ip3 and Ip90
were supplied by L. Mayer of the Center for Disease
Control, Atlanta, Georgia. The North American strains
included strain 25015, provided by J. Anderson of the
Connecticut Department of Agriculture; and strain B31 (ATCC
35210).

Monoclonal Antibodies

Seven monoclonal antibodies (MAbs) were utilized in
this study. Five of the MAbs (12, 13, 15, 83 and 336) were
produced from hybridomas cloned and subcloned as previously
described (Schubach, W.H., et al., Infect. Immun.
59(6): 1911-1915 (1991)). MAb H5332 (Barbour, A.G. et al.,
Infect. Immun. 41: 795-804 (1983)) was a gift from Dr. Alan
Barbour, University of Texas, and MAb CIII.78 (Sears, J.E.
et al., J. Immunol. 147(6): 1995-2000 (1991)) was a gift
from Richard A. Flavell, Yale University. MAbs 12 and 15
were raised against whole sonicated B31; MAb 336 was
produced against whole PGau; and MAbs 13 and 83 were raised
to a truncated form of OspA cloned from the K48 strain and
expressed in E. coli using the T7 RNA polymerase system
(McGrath, B.C. et al., Vaccines, Cold Spring Harbor
Laboratory Press, Plainview, New York, pp. 365-370 (1993)).
All MAbs were typed as being Immunoglobulin G (IgG).

Methods of Protein Cleavage, Western Blotting, and
Amino-Terminal Sequencing

Prediction of the various cleavage sites was achieved
by knowledge of the primary amino acid sequence derived
from the full nucleotide sequences of OspA, many of which
are currently available (see Table II, below). Cleavage
sites can also be predicted based on the peptide sequence
of OspA, which can be determined by standard techniques
after isolation and purification of OspA by the method
described above. Cleavage of several OspA isolates was
conducted to determine the localization of monoclonal
antibody binding of the proteins.

Hydroxylamine-HCl (HA), N-chlorosuccinimide (NCS), and
cyanogen bromide cleavage of OspA followed the methods
described by Bornstein (Biochem. 9(12): 2408-2421 (1970)),
Shechter et al. (Biochem. 15(23): 5071-5075 (1976)), and
Gross (in Hirs, C.H.W. (ed.): Methods in Enzymology (N.Y.
Acad. Press), 11: 238-255 (1967)), respectively. Protease
cleavage by endoproteinase Asp-N (Boehringer Mannheim,
Indianapolis, Indiana) was performed as described by
Cleveland, D.W. et al. (J. Biol. Chem. 252: 1102-1106
(1977)). Ten micrograms of OspA were used for each
reaction. The ratio of enzyme to OspA was approximately 1
to 10 (w/w).

Proteins and peptides generated by cleavage were
separated by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) (Laemmli, U.K., Nature (London) 227: 680-685
(1970)), and electroblotted onto Immobilon polyvinylidene
difluoride (PVDF) membranes (Pluskal, M.G. et al.,
BioTechniques 4: 272-283 (1986)). They were detected by
amido black staining or by immunostaining with murine MAbs,
followed by alkaline phosphatase-conjugated goat anti-mouse
IgG. Specific binding was detected using a
5-bromo-4-chloro-3-indolylphosphate (BCIP)/nitroblue
tetrazolium (NBT) developer system (KPL Inc., Gaithersburg,
Maryland).

In addition, amino-terminal amino acid sequence
analysis was carried out on several cleavage products, as
described by Luft et al. (Infect. Immun. 57: 3637-3645
(1989)). Amido black stained bands were excised from PVDF
blots and sequenced by Edman degradation using a Biosystems
model 475A sequenator with model 120A PTH analyzer and
model 900A control/data analyzer.



WO 95/12676   PCT/US94/12352

Cleavage Products of Outer Surface Protein A Isolates

Purified OspA-B31, labeled with 14C-palmitic acid, was
fragmented with hydroxylamine-HCl (HA) into two peptides,
designated HA1 and HA2 (data not shown). The HA1 band
migrated at 27 KD and retained its radioactivity,
indicating that the peptide included the lipidation site at
the N-terminus of the molecule (data not shown). From the
predicted cleavage point, HA1 should correspond to residues
1 to 251 of OspA-B31. HA2 had a MW of 21.6 KD by SDS-PAGE,
with amino-terminal sequence analysis showing it to begin
at Gly72, i.e. residues 72 to 273 of OspA-B31. By
contrast, HA cleaved OspA-K48 into three peptides,
designated HA1, HA2, and HA3, with apparent MWs of 22 KD, 16
KD and 12 KD, respectively. Amino-terminal sequencing
showed HA1 to start at Gly72, and HA3 at Gly142. HA2 was
found to have a blocked amino-terminus, as was observed for
the full-length OspA protein. HA1, 2 and 3 of OspA-K48
were predicted to be residues 72 to 274, 1 to 141 and 142 to
274, respectively.
N-Chlorosuccinimide (NCS) cleaves at tryptophan (W),
which is at residue 216 of OspA-B31 or residue 217 of
OspA-K48 (data not shown). NCS cleaved OspA-B31 into 2
fragments: NCS1, with a MW of 23 KD, residues 1-216 of the
protein, and NCS2, with a MW of 6.2 KD, residues 217 to 273
(data not shown). Similarly, K48 OspA was divided into 2
pieces, NCS1 residues 1-217, and NCS2 residues 218 to 274
(data not shown).

Cleavage of OspA by cyanogen bromide (CNBr) occurs at
the carboxy side of methionine, residue 39. The major
fragment, CNBr1, has a MW of 25.7 KD, residues 39-274 by
amino-terminal amino acid sequence analysis (data not
shown). CNBr2 (about 4 KD) could not be visualized by
amido black staining; instead, lightly stained bands of
about 20 KD MW were seen. These bands reacted with
anti-OspA MAbs, and most likely were degradation products
due to cleavage by formic acid.
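
The prediction of cleavage fragments from a primary sequence, as
used above, can be illustrated with a short sketch. It is not the
procedure used in this study. The cleavage rules are the commonly
cited specificities (hydroxylamine between Asn and Gly, NCS after
Trp, CNBr after Met) and are stated here as assumptions; the function
names and the toy peptide are hypothetical, and complete digestion is
assumed, whereas the digests described above were partial.

```python
# Sketch: predict chemical cleavage fragments from a protein sequence.
# Assumed cleavage rules:
#   HA   (hydroxylamine)       : between Asn and Gly
#   NCS  (N-chlorosuccinimide) : after Trp
#   CNBr (cyanogen bromide)    : after Met

def cut_points(seq, reagent):
    """Return 1-based positions of the residue preceding each cut."""
    cuts = []
    for i, aa in enumerate(seq[:-1], start=1):
        nxt = seq[i]  # residue at 1-based position i + 1
        if reagent == "HA" and aa == "N" and nxt == "G":
            cuts.append(i)
        elif reagent == "NCS" and aa == "W":
            cuts.append(i)
        elif reagent == "CNBr" and aa == "M":
            cuts.append(i)
    return cuts


def fragments(seq, reagent):
    """Return (start, end) residue ranges after complete digestion."""
    bounds = [0] + cut_points(seq, reagent) + [len(seq)]
    return [(a + 1, b) for a, b in zip(bounds, bounds[1:])]


# Hypothetical 20-residue toy peptide; this is NOT an OspA sequence.
toy = "MKKYLLGMANGSDWKTEALK"
for reagent in ("HA", "NCS", "CNBr"):
    print(reagent, fragments(toy, reagent))
```

Applied to a full OspA sequence, the same ranges could be compared
with the amino-terminal sequencing results reported for each band.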

Determination of Antibody Binding Domains for Anti-OspA
Monoclonal Antibodies

The cleavage products of OspA-B31 and OspA-K48 were
analyzed by Western blot to assess their ability to bind to
the six different MAbs. Preliminary Western blot analysis
of the cleavage products demonstrated that strains K48 and
DK29 have similar patterns of reactivity, as do Ip3, PGau
and PKo. The OspA of strain PTrob was immunologically
distinct from the others, being recognized only by MAb 336.
MAb 12 recognized only the two North American strains, B31
and 25015. When the isolates were separated into
genogroups, it was remarkable that all the MAbs, except MAb
12, crossed over to react with multiple genogroups.

MAb12, specific for OspA-B31, bound to both HA1 and
HA2 of OspA-B31. However, cleavage of OspA-B31 by NCS at
residue Trp216 created fragments which did not react with
MAb12, suggesting that the relevant domain is near or is
structurally dependent upon the integrity of this residue
(data not shown). MAb 13 bound only to OspA-K48, and to
peptides containing the amino-terminus of that molecule
(e.g. HA2; NCS1). It did not bind to CNBr1 residues 39 to
274. Thus the domain recognized by MAb13 is in the
amino-terminal end of OspA-K48, near Met38.

MAb15 reacts with the OspA of both the B31 and K48
strains, and with peptides containing the N-terminus of
OspA, such as HA1 of OspA-B31 and NCS1, but not with
peptides HA2 of OspA-B31 and HA1 of OspA-K48 (data not
shown). Both peptides include residue 72 to the C-terminus
of the molecules. MAb15 bound to CNBr1 of OspA-K48,
indicating the domain for this antibody to be residues 39
to 72, specifically near Gly72 (data not shown).

MAb83 binds to OspA-K48, and to peptides containing
the C-terminal portion of the molecule, such as HA1. It
does not bind to HA2 of OspA-K48, most likely because the
C-terminus of HA2 of OspA-K48 ends at 141. Similar to MAb12
and OspA-B31, binding of MAbs 83 and CIII.78 is eliminated
by cleavage of OspA at the tryptophan residue. Thus
binding of MAbs 12, 83 and CIII.78 to OspA depends on the
structural integrity of the Trp216 residue, which appears
to be critical for antigenicity. Also apparent is that,
although these MAbs bind to a common antigenic domain, the
precise epitopes which they recognize are distinct from one
another, given the varying degrees of cross-reactivity to
these MAbs among strains.

Although there is similar loss of binding activity of
MAb336 with cleavage at Trp216, this MAb does not bind to
HA1 of OspA-B31, suggesting the domain for this antibody
includes the carboxy-terminal end of the molecule,
inclusive of residues 251 to 273. Low MW peptides, such as
HA3 (10 KD) and NCS2 (6 KD), of OspA-K48 do not bind this
MAb on Western blots. In order to confirm this
observation, we tested binding of the 6 MAbs with a
recombinant fusion construct P3A/EC that contains a trpE
leader protein fused with residues 217 to 273 of OspA-B31
(Schubach, W.H. et al., Infect. Immun. 59(6): 1911-1915
(1991)). Only MAb336 reacted with this construct (data not
shown). Peptides and antigenic domains localized by
fragmentation of OspA are summarized in Figure 1.
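
The reasoning used to localize a binding domain from fragment
reactivity can be sketched as simple interval logic: the domain
should lie within every fragment that binds and outside every
fragment that does not. The example below applies this to MAb15
using the OspA-K48 fragment ranges given in the text. The function
name is hypothetical, and treating NCS1 as the K48 fragment is an
assumption. The sketch also ignores conformational effects, so a
loss of binding caused by disrupted folding would be misread as
absence of the epitope.

```python
def localize(binding, non_binding):
    """Residues shared by all binding fragments and absent from all
    non-binding fragments. Fragments are (start, end), inclusive."""
    lo = max(start for start, _ in binding)
    hi = min(end for _, end in binding)
    candidate = set(range(lo, hi + 1))
    for start, end in non_binding:
        candidate -= set(range(start, end + 1))
    return sorted(candidate)


# Fragment ranges for OspA-K48 as stated in the text.
NCS1 = (1, 217)    # binds MAb15 (assumed to refer to the K48 fragment)
CNBR1 = (39, 274)  # binds MAb15
HA1 = (72, 274)    # does not bind MAb15

region = localize(binding=[NCS1, CNBR1], non_binding=[HA1])
print(region[0], region[-1])  # 39 71
```

The interval 39 to 71 obtained this way agrees with the assignment of
the MAb15 domain to residues 39 to 72, near Gly72.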

Mapping of Domains to Define the Molecular Basis for
the Serotype Analysis

To define the molecular basis for the serotype
analysis of OspA, we compared the derived amino acid
sequences of OspA for the nine isolates (Figure 2). At the
amino terminus of the protein, these predictions can be
more precise given the relatively small number of amino
acid substitutions in this region compared to the carboxy
terminus. Domain 1, which is recognized by MAb13, includes
residues Leu34 to Leu41. MAb13 only binds to the OspA of
strains K48, DK29 and Ip90. Within this region, residue 37
is variable; however, Gly37 is conserved amongst the three
reactive strains. When Gly37 is changed to Glu37, as it is
in OspA of strains B31, PTrob, PGau, and PKo, MAb13 does
not recognize the protein (data not shown). By similar
analysis, it can be seen that Asp70 is a crucial residue
for Domain 2, which includes residues 65 to 75 and is
recognized by MAb15. Domain 3 is reactive with MAbs H5332,
12 and 83, and includes residues 190-220. It is clear that
significant heterogeneity exists between MAbs reactive with
this domain, and that more than one conformational epitope
must be contained within the sequence. Domain 4 binds
MAb336, and includes residues 250 to 270. In this region,
residue 266 is variable and therefore may be an important
determinant. It is apparent, however, that other
determinants of the reactivity of this monoclonal antibody
reside in the region comprising amino acids 217-250.
Furthermore, the structural integrity of Trp216 is
essential for antibody reactivity in the intact protein.
Finally, it is important to stress that Figure 2 indicates
only the locations of the domains, and does not necessarily
encompass the entire domain. Exact epitopes are being
analyzed by site-directed mutagenesis of specific residues.
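
The inference that residue 37 determines MAb13 reactivity rests on
the observation that the residue at that position separates the
reactive strains from the non-reactive ones. A minimal sketch of
that check follows, using only the residue 37 assignments stated
above; the function name and data layout are hypothetical.

```python
# Residue at position 37 of OspA, as stated in the text.
residue_37 = {
    "K48": "G", "DK29": "G", "Ip90": "G",
    "B31": "E", "PTrob": "E", "PGau": "E", "PKo": "E",
}
# Strains whose OspA is bound by MAb13, as stated in the text.
reactive = {"K48", "DK29", "Ip90"}


def segregates(residues, reactive_strains):
    """True if no residue is shared between reactive and
    non-reactive strains (both groups must be non-empty)."""
    in_reactive = {aa for s, aa in residues.items() if s in reactive_strains}
    in_other = {aa for s, aa in residues.items() if s not in reactive_strains}
    return bool(in_reactive) and bool(in_other) and in_reactive.isdisjoint(in_other)


print(segregates(residue_37, reactive))  # True
```

Running the same test over every column of an alignment would list
all positions that are candidates for a given antibody, which could
then be examined by site-directed mutagenesis.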

Overall, evidence suggests that the N-terminal portion
is not the immunodominant domain of OspA, possibly by
virtue of its lipidation, and the putative function of the
lipid moiety in anchoring the protein to the outer
envelope. The C-terminal end is immunodominant and
includes domains that account in part for structural
heterogeneity (Wilske, B. et al., Med. Microbiol. Immunol.
181: 191-207 (1992)), and may provide epitopes for antibody
neutralization (Sears, J.E. et al., J. Immunol. 147(6):
1995-2000 (1991)), and relate to other activities, such as
the induction of T-cell proliferation (Shanafel, M.M., et
al., J. Immunol. 148: 218-224 (1992)). There are common
epitopes in the carboxy-end of the protein that are shared
among genospecies which may have immunoprotective potential
(Wilske, B., et al., Med. Microbiol. Immunol. 181: 191-207
(1992)).

Prediction of secondary structure on the basis of
hydropathy analysis and circular dichroism and fluorescence
spectroscopy measurements (McGrath, B.C., et al., Vaccines,
Cold Spring Harbor Laboratory Press, Plainview, New York;
pp. 365-370 (1993)) suggests domains 3 and 4 to be in a
region of the molecule with a propensity to form
alpha-helix, whereas domains 1 and 2 occur in regions
predicted to be beta-sheets (see Figure 1). These
differences may distinguish domains in accessibility to
antibody or to reactive T-cells (Shanafel, M.M. et al., J.
Immunol. 148: 218-224 (1992)). Site-directed mutagenesis of
specific epitopes, as described below in Example 2, aids in
identifying exact epitopes.



Example 2. Identification of an Immunologically
Important Hypervariable Domain of the Major
Outer Surface Protein A of Borrelia

This Example describes epitope mapping studies using
chemically cleaved OspA and TrpE-OspA fusion proteins. The
studies indicate a hypervariable region surrounding the
single conserved tryptophan residue of OspA (at residue
216, or in some cases 217), as determined by a moving
window population analysis of OspA from fifteen European
and North American isolates of Borrelia. The hypervariable
region is important for immune recognition.
Site-directed mutagenesis was also conducted to
examine the hypervariable regions more closely.
Fluorescence and circular dichroism spectroscopy have
indicated that the conserved tryptophan is part of an
alpha-helical region in which the tryptophan is buried in a
hydrophobic environment (McGrath, B.C., et al., Vaccines,
Cold Spring Harbor Laboratory Press, Plainview, New York;
pp. 365-370 (1993)). More polar amino acid side-chains
flanking the tryptophan are likely to be exposed to the
hydrophilic solvent. The hypervariability of these
solvent-exposed residues among the various strains of
Borrelia suggested that these amino acid residues may
contribute to the antigenic variation in OspA. Therefore,
site-directed mutagenesis was performed to replace some of
the potentially exposed amino acid side chains in the
protein from one strain with the analogous residues of a
second strain. The altered proteins were then analyzed by
Western Blot using monoclonal antibodies which bind OspA on
the surface of the intact, non-mutated spirochete. The
results indicated that certain specific amino acid changes
near the tryptophan can abolish reactivity of OspA to these
monoclonal antibodies.

A. Verification of Clustered Polymorphisms in Outer
Surface Protein A Sequences

Cloning and sequencing of the OspA protein from
fifteen European and North American isolates (described
above in Table I) demonstrated that amino acid polymorphism
is not randomly distributed throughout the protein; rather,
polymorphism tended to be clustered in three regions of
OspA. The analysis was carried out by plotting the moving,
weighted average polymorphism of a window (a fixed length
subsection of the total sequence) as it is slid along the
sequence. The window size in this analysis was thirteen
amino acids, based upon the determination of the largest
number of significantly deviating points as established by
the method of Tajima (J. Mol. Evol. 33: 470-473 (1991)).
The average weighted polymorphism was calculated by summing
the number of variant alleles for each site. Polymorphism
calculations were weighted by the severity of amino acid
replacement (Dayhoff, M.O. et al., in: Dayhoff, M.O. (ed.)
Atlas of Protein Sequence and Structure, NBRF, Washington,
vol. 5, Suppl. 3: 345 (1978)). The sum was normalized by
the window size and plotted. Amino acid sequence
position 1 corresponds to a window that encompasses amino
acids 1 through 13. Bootstrap resampling was used to
generate 95% confidence intervals on the sliding window
analysis. Since Borrelia has been shown to be clonal, the
bootstrap analysis should give a reliable estimate of the
expected variance of the polymorphism calculations. The
bootstrap was iterated five hundred times at each position,
and the mean was calculated from the sum of all positions.
The clonal nature of Borrelia ensures that the stochastic
variance that results from differing genealogical histories
of the sequence positions (as would be expected if
recombination were prevalent) will be minimized.
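
The sliding window calculation described above can be sketched as
follows. This is an illustration under stated assumptions and not
the analysis actually performed: each variant allele is given a
weight of 1 rather than a Dayhoff-based severity weight, the
bootstrap resamples whole sequences with replacement (the text does
not specify the resampling unit), and the toy alignment and all
function names are hypothetical.

```python
import random


def site_polymorphism(column):
    """Number of variant alleles at one aligned site
    (distinct residues minus one); unweighted by assumption."""
    return len(set(column)) - 1


def window_profile(alignment, window=13):
    """Polymorphism summed over each window and normalized by the
    window size, for every window start along the alignment."""
    length = len(alignment[0])
    per_site = [site_polymorphism([seq[i] for seq in alignment])
                for i in range(length)]
    return [sum(per_site[i:i + window]) / window
            for i in range(length - window + 1)]


def bootstrap_interval(alignment, window=13, iterations=500, seed=0):
    """Approximate 95% percentile interval for each window, obtained
    by resampling sequences with replacement (an assumption)."""
    rng = random.Random(seed)
    n = len(alignment)
    profiles = []
    for _ in range(iterations):
        sample = [alignment[rng.randrange(n)] for _ in range(n)]
        profiles.append(window_profile(sample, window))
    intervals = []
    for values in zip(*profiles):
        ordered = sorted(values)
        lo = ordered[int(0.025 * (iterations - 1))]
        hi = ordered[int(0.975 * (iterations - 1))]
        intervals.append((lo, hi))
    return intervals


# Hypothetical toy alignment of three 8-residue sequences, not OspA data.
toy = ["AAAAKTSA", "AAAAKSTA", "AAAAKTTA"]
print(window_profile(toy, window=4))  # [0.0, 0.0, 0.25, 0.5, 0.5]
print(bootstrap_interval(toy, window=4, iterations=50))
```

Windows whose observed value lies above the upper bound expected for
the sequence as a whole would correspond to the excesses of
polymorphism referred to in the text.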

This test verified that the three regions around the
observed peaks all have significant excesses of
polymorphism. Excesses of polymorphism were observed in
the regions including amino acid residues 132-145, residues
163-177, and residues 208-221 (Figure 3). An amino acid
alignment between residues 200 and 220 for B31, K48 and the
four site-directed mutants is shown in Figure 4. The amino
acid 208-221 region includes the region of OspA which has
been modeled as an oriented alpha-helix in which the single
tryptophan residue at amino acid 216 is buried in a
hydrophobic pocket, thereby exposing more polar amino acids
to the solvent (Figure 5) (France, L.L., et al., Biochim.
Biophys. Acta 1120: 59 (1992)). These potentially
solvent-exposed residues showed considerable variability
among the OspAs from various strains and may be an important
component of OspA antigenic variation. For the purposes of
generating chimeric proteins, the hypervariable domains of
interest are Domain A, which includes amino acid residues
120-140 of OspA; Domain B, which includes residues 150-180;
and Domain C, which includes residues 200-216 or 217.

B. Site-Directed Mutagenesis of the Hypervariable Region

Site-directed mutagenesis was performed to convert
residues within the 204-219 domain of the recombinant B31
OspA to the analogous residues of a European OspA variant,
K48. In the region of OspA between residues 204 and 219,
which includes the helical domain (amino acids 204-217),
there are seven amino acid differences between OspA-B31 and
OspA-K48. Three oligonucleotides were generated, each
containing nucleotide changes which would incorporate K48
amino acids at their analogous positions in the B31 OspA
protein. The oligos used to create the site-directed
mutants were:

5'-CTTAATGACTCTGACACTAGTGC-3' (#613, which converts
threonine at position 204 to serine, and serine at 206 to
threonine (Thr204-Ser, Ser206-Thr)) (SEQ ID NO. 1);

5'-GCTACTAAAAAAACCGGGAAATGGAATTCA-3' (#625, which converts
alanine at 214 to glycine, and alanine at 215 to lysine
(Ala214-Gly, Ala215-Lys)) (SEQ ID NO. 2); and

5'-GCAGCTTGGGATTCAAAAACATCCACTTTAACA-3' (#640, which
converts asparagine at 217 to aspartate, and glycine at
219 to lysine (Asn217-Asp, Gly219-Lys)) (SEQ ID NO. 3).

Site-directed mutagenesis was carried out by
performing mutagenesis with pairs of the above oligos.
Three site-directed mutants were created, each with two
changes: OspA 613 (Thr204-Ser, Ser206-Thr), OspA 625
(Ala214-Gly, Ala215-Lys), and OspA 640 (Asn217-Asp,
Gly219-Lys). There were also two proteins with four
changes: OspA 613/625 (Thr204-Ser, Ser206-Thr, Ala214-Gly,
Ala215-Lys) and OspA 613/640 (Thr204-Ser, Ser206-Thr,
Asn217-Asp, Gly219-Lys).
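
As a consistency check on the mutagenic oligonucleotides, each can
be translated with the standard genetic code to confirm that it
encodes the intended K48 residues. The sketch below assumes that
each oligonucleotide is written in the sense orientation and that
its first base begins a codon; neither point is stated in the text.
The function and variable names are hypothetical.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons in frame 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Oligonucleotide sequences copied from SEQ ID NO. 1 to 3 above.
OLIGOS = {
    "613": "CTTAATGACTCTGACACTAGTGC",
    "625": "GCTACTAAAAAAACCGGGAAATGGAATTCA",
    "640": "GCAGCTTGGGATTCAAAAACATCCACTTTAACA",
}

for name, seq in OLIGOS.items():
    print(name, translate(seq))
# 613 LNDSDTS
# 625 ATKKTGKWNS
# 640 AAWDSKTSTLT
```

Under these assumptions the encoded peptides contain Ser-Asp-Thr,
Gly-Lys-Trp and Trp-Asp-Ser-Lys, which is consistent with Ser204 and
Thr206, Gly214 and Lys215, and Asp217 and Lys219 flanking Trp216.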

Specificity of Antibody Binding to Epitopes of the
Non-mutated Hypervariable Region

Monoclonal antibodies that agglutinate spirochetes,
including several which are neutralizing in vitro,
recognize epitopes that map to the hypervariable region
around Trp216 (Barbour, A.G. et al., Infect. Immun. 41:
759 (1983); Schubach, W.H. et al., Infect. Immun. 59:
1911 (1991)). Western Blot analysis demonstrated that
chemical cleavage of OspA from the B31 strain at Trp216
abolishes reactivity of the protein with the agglutinating
Mab 105, a monoclonal raised against B31 spirochetes (data
not shown). The reagent, N-chlorosuccinimide (NCS),
cleaves OspA at Trp216, forming a 23.2 kd fragment and
a 6.2 kd peptide which is not retained on the Immobilon-P
membrane after transfer. The uncleaved material binds Mab
105; however, the 23.2 kd fragment is unreactive. Similar
Western blots with a TrpE-OspA fusion protein containing
the carboxy-terminal portion of the OspA protein
demonstrated that the small 6.2 kd piece also fails to bind
Mab 105 (Schubach, W.H. et al., Infect. Immun. 59: 1911
(1991)).

Monoclonal antibodies H5332 and H3TS (Barbour, A.G. et
al., Infect. Immun. 41: 759 (1983)) have been shown by
immunofluorescence to decorate the surface of fixed
spirochetes (Wilske, B. et al., World J. Microbiol. 7: 130
(1991)). These monoclonals also inhibit the growth of the
organism in culture. Epitope mapping with fusion proteins
has confirmed that the epitopes which bind these Mabs are
conformationally determined and reside in the carboxy half
of the protein. Mab H5332 is cross-reactive among all of
the known phylogenetic groups, whereas Mab H3TS and Mab 105
seem to be specific to the B31 strain to which they were
raised. Like Mab 105, the reactivities of H5332 and H3TS
to OspA are abrogated by fragmentation of the protein at
Trp216 (data not shown). Mab 336 was raised to whole
spirochetes of the strain PGau. It cross-reacts to OspA
from group 1 (the group to which B31 belongs) but not to
group 2 (of which K48 is a member). Previous studies using
fusion proteins and chemical cleavage have indicated that
this antibody recognizes a domain of OspA in the region
between residues 217 and 273 (data not shown). All of
these Mabs will agglutinate the B31 spirochete.

Western Blot Analysis of Antibody Binding to Mutated
Hypervariable Regions

Mabs were used for Western Blot analysis of the
site-directed OspA mutants induced in E. coli using the T7
expression system (Dunn, J.J. et al., Protein Expression
and Purification 1: 159 (1990)). E. coli cells carrying
pET9c plasmids having a site-directed OspA mutant insert
were induced at mid-log phase growth with IPTG for four
hours at 37°C. Cell lysates were made by boiling an
aliquot of the induced cultures in SDS gel loading dye,
and this material was then loaded onto a 12% SDS gel
(BioRad mini-Protean II) and electrophoresed. The
proteins were then transferred to Immobilon-P membranes
(Millipore) at 70 V for 2 hours at 4°C using the BioRad mini
transfer system. Western analysis was carried out as
described by Schubach et al. (Infect. Immun. 59: 1911
(1991)).

Western Blot analysis indicated that only the 625
mutant (Ala214-Gly and Ala215-Lys) retained binding to the
agglutinating monoclonal H3TS (data not shown). However,
the 613/625 mutant, which has additional alterations on the
amino-terminal side of Trp216 (Thr204-Ser and Ser206-Thr),
did not bind this monoclonal. Both 640 and 613/640 OspAs,
which have the Asn217-Asp and Gly219-Lys changes on the
carboxy-terminal side of Trp216, also failed to bind Mab
H3TS. This indicated that the epitope of the B31 OspA which
binds H3TS is comprised of amino acid side-chains on both
sides of Trp216.

The 613/625 mutant failed to bind Mabs 105 and H5332,
while the other mutants retained their ability to bind
these Mabs. This is important in light of the data using
fusion proteins that indicate that Mab 105 behaves more
like Mab H3TS in terms of its serotype specificity and
binding to OspA (Wilske, B. et al., Med. Microbiol.
Immunol. 181: 191 (1992)). The 613/625 protein has, in
addition to the differences at residues Thr204 and Ser206,
changes immediately amino-terminal to Trp216 (Ala214-Gly
and Ala215-Lys). The abrogation of reactivity of Mabs 105
and H5332 to this protein indicated that the epitopes of
OspA which bind these monoclonals are comprised of residues
on the amino-terminal side of Trp216.

The two proteins carrying the Asn217-Asp and
Gly219-Lys replacements on the carboxy-terminal side of
Trp216 (OspAs 640 and 613/640) retained binding to Mabs 105
and H5332; however, they failed to react with Mab 336, a
monoclonal which has been mapped with TrpE-OspA fusion
proteins and by chemical cleavage to a more
carboxy-terminal domain. This result may explain why Mab
336 failed to recognize the K48-type of OspA (Group 2).

It is clear that amino acids Thr204 and Ser206 play an
important part in the agglutinating epitopes in the region
of the B31 OspA flanking Trp216. Replacement of these two
residues altered the epitopes of OspA that bind Mabs 105,
H3TS and H5332. The ability of the 640 changes alone to
abolish reactivity of Mab 336 indicated that Thr204 and
Ser206 are not involved in direct interaction with Mab 336.

The results indicated that the epitopes of OspA which
are available to Mabs that agglutinate spirochetes are
comprised at least in part by amino acids in the immediate
vicinity of Trp216. Since recent circular dichroism
analysis indicated that the structures of B31 and K48 OspA
differ very little within this domain, it is unlikely that
the changes made by mutation have radically altered the
overall structure of the OspA protein (France, L.L. et al.,
Biochim. Biophys. Acta 1120: 59 (1992); and France et al.,
Biochim. Biophys. Acta, submitted (1993)). This hypothesis
is supported by the finding that the recombinant, mutant
OspAs exhibit the same high solubility and purification
properties as the parent B31 protein (data not shown).

In summary, amino acid side-chains at Thr204 and
Ser206 are important for many of the agglutinating
epitopes. However, a limited set of conservative changes
at these sites was not sufficient to abolish binding of
all of the agglutinating Mabs. These results suggested
that the agglutinating epitopes of OspA are distinct, yet
may have some overlap. The results also supported the
hypothesis that the surface-exposed epitope around Trp216,
which is thought to be important for immune recognition and
neutralization, is a conformationally-determined and complex
domain of OspA.



EXAMPLE 3. Borrelia Strains and Proteins

Proteins and genes from any strain of Borrelia can be
utilized in the current invention. Representative strains
are summarized in Table I, above.

A. Genes Encoding Borrelia Proteins

The chimeric peptides of the current invention can
comprise peptides derived from any Borrelia proteins.
Representative proteins include OspA, OspB, OspC, OspD,
p12, p39, p41 (fla), p66, and p93. Nucleic acid sequences
encoding several Borrelia proteins are presently available
(see Table II, below); alternatively, nucleic acid
sequences encoding Borrelia proteins can be isolated and
characterized using methods such as those described below.

Table II. References for Nucleic Acid Sequences for Several
Proteins of Various Borrelia Strains

Strain   p93                                  OspA                                    p41 (fla)

K48      X69602 (SID 67)                      X62624 (SID 8)                          X69610 (SID 49)
PGau     SID 73                               X62387 (SID 10)                         X69612 (SID 51)
DK29                                          [illegible in source]                   X69608 (SID 53)
PKo      X69803 (SID 77)                      X65599 (SID 141)                        X69613 (SID 131)
PTrob    X69604 (SID 71)                      X65598 (SID 135)                        X69614 (SID 55)
Ip3                                           X70365 (SID 140)
Ip90     ND                                   Kryuchechnikov, V.N. et al.,            -
                                              J. Microbiol. Epid. Immunobiol.
                                              12: 41-44 (1988) (SID 138)
25015    X70365 (SID 75)                      Fikrig, E.S. et al., J. Immunol.
                                              7: 2256-2260 (1992) (SID 12)
B31      Perng, G.C. et al., Infect. Immun.   Bergstrom, S. et al., Mol. Microbiol.   Gassmann, G.S. et al., Nucl. Acids
         59: 2070-74 (1992);                  3: 479-486 (1989) (SID 6)               Res. 17: 3590 (1989) (SID 127)
         Luft, B.J. et al., Infect. Immun.
         60: 4309-4321 (1992) (SID 65)
PKal                                          X69606 (SID 132)                        X69611 (SID 129)
ZS7                                           Jonsson, M. et al., Infect. Immun.
                                              60: 1845-1853 (1992) (SID 134)
N40                                           Kryuchechnikov, V.N. et al. (SID 133)
PHei                                          X65600 (SID 136)
ACAI                                          Kryuchechnikov, V.N. et al. (SID 142)
PBo      X69601 (SID 69)                      X65605 (SID 139)                        X69610 (SID 130)

Numbers with an "X" prefix are GenBank data base accession numbers.
SID = SEQ ID NO.

B. Isolation of Borrelia Genes

Nucleic acid sequences encoding full-length, lipidated
proteins from known Borrelia strains were isolated using
the polymerase chain reaction (PCR) as described below. In
addition, nucleic acid sequences were generated which
encoded truncated proteins (proteins in which the
lipidation signal has been removed, such as by eliminating
the nucleic acid sequence encoding the first 18 amino
acids, resulting in non-lipidated proteins). Other
nucleic acid sequences were generated which encoded
polypeptides of a particular gene (i.e., encoding a segment
of the protein which has a different number of amino acids
than the protein does in nature). Using similar methods as
those described below, primers can be generated from known
nucleic acid sequences encoding Borrelia proteins and used
to isolate other genes encoding Borrelia proteins. Primers
can be designed to amplify all of a gene, as well as to
amplify a nucleic acid sequence encoding truncated protein
sequences, such as described below for OspC, or nucleic
acid sequences encoding a polypeptide derived from a
Borrelia protein. Primers can also be designed to
incorporate unique restriction enzyme cleavage sites into
the amplified nucleic acid sequences. Sequence analysis of
the amplified nucleic acid sequences can then be performed
using standard techniques.

Cloning and Sequencing of OspA Genes and Relevant
Nucleic Acid Sequences

Borrelia OspA sequences were isolated in the following
manner: 100 µl reaction mixtures containing 50 mM KCl, 10
mM Tris-HCl (pH 8.3), 1.5 mM MgCl2, 200 µM each NTP, 2.5
units of Taq DNA polymerase (Amplitaq, Perkin-Elmer/Cetus)
and 100 pmol each of the 5' and 3' primers (described
below) were used. Amplification was performed in a
Perkin-Elmer/Cetus thermal cycler as described (Schubach,
W.H. et al., Infect. Immun. 59: 1911-1915 (1991)). The
amplicon was visualized on an agarose gel by ethidium
bromide staining. Twenty nanograms of the
chloroform-extracted PCR product were cloned directly into
the PC-TA vector (Invitrogen) by following the
manufacturer's instructions. Recombinant colonies
containing the amplified fragment were selected, the
plasmids were prepared, and the nucleic acid sequence of
each OspA was determined by the dideoxy chain-termination
technique using the Sequenase kit (United States
Biochemical). Directed sequencing was performed with M13
primers followed by OspA-specific primers derived from
sequences previously obtained with M13 primers.

Because the 5' and 3' ends of the OspA gene are highly
conserved (Fikrig, E.S. et al., J. Immunol. 7: 2256-2260
(1992); Bergstrom, S. et al., Mol. Microbiol. 3: 479-486
(1989); Zumstein, G. et al., Med. Microbiol. Immunol. 181:
57-70 (1992)), the 5' and 3' primers for cloning can be
based upon any known OspA sequences. For example, the
following primers based upon the OspA nucleic acid sequence
from strain B31 were used:

5'-GGAGAATATATTATGAAA-3' (-12 to +6) (SEQ ID NO. 4); and
5'-CTCCTTATTTTAAAGCG-3' (+826 to +809) (SEQ ID NO. 5).
(Schubach, W.H. et al., Infect. Immun. 59: 1811-1915 (1991)).

OspA genes isolated in this manner include those for
strains B31, K48, PGau, and 25015; the nucleic acid
sequences are depicted in the sequence listing as SEQ ID
NO. 6 (OspA-B31), SEQ ID NO. 8 (OspA-K48), SEQ ID NO. 10
(OspA-PGau), and SEQ ID NO. 12 (OspA-25015). An alignment
of these and other OspA nucleic acid sequences is shown in
Figure 42. The amino acid sequences of the proteins
encoded by these nucleic acid sequences are represented as
SEQ ID NO. 7 (OspA-B31), SEQ ID NO. 9 (OspA-K48), SEQ ID
NO. 11 (OspA-PGau), and SEQ ID NO. 13 (OspA-25015).

The following primers were used to generate specific
nucleic acid sequences of the OspA gene, to be used to
generate chimeric nucleic acid sequences (as described in
Example 4):

5'-GTCTGCAAAAACCATGACAAG-3' (plus strand primer #369)
(SEQ ID NO. 14);
5'-GTCATCAACAGAAGAAAAATTC-3' (plus strand primer #357)
(SEQ ID NO. 15);
5'-CCGGATCCATATGAAAAAATATTTATTGGG-3' (plus strand primer
#607) (SEQ ID NO. 16);
5'-CCGGGATCCATATGGCTAAGCAAAATGTTAGC-3' (plus strand primer
#584) (SEQ ID NO. 17);
5'-GCGTTCAAGTACTCCAGA-3' (minus strand primer #200)
(SEQ ID NO. 18);
5'-GATATCTAGATCTTATTTTAAAGCGTT-3' (minus strand primer
#586) (SEQ ID NO. 19); and
5'-GGATCCGGTGACCTTTTAAAGCGTTTTTAAT-3' (minus strand primer
#1169) (SEQ ID NO. 20).

Cloning and Sequencing of OspB 

Similar methods were also used to isolate OspB genes.
One OspB gene isolated is represented as SEQ ID NO. 21
(OspB-B31); its encoded amino acid sequence is SEQ ID NO.
22.

The following primers were used to generate specific
nucleic acid sequences of the OspB gene, to be used in
generation of chimeric nucleic acid sequences (see Example
4):

5'-GGTACAATTACAGTACAA-3' (plus strand primer #721)
(SEQ ID NO. 23);
5'-CCGAGAATCTCATATGGCACAAAAAGGTGCTGAGTCAATTGG-3' (plus
strand primer #1105) (SEQ ID NO. 24);
5'-CCGATATCGGATCCTATTTTAAAGCGTTTTTAAGC-3' (minus strand
primer #1106) (SEQ ID NO. 25); and
5'-GGATCCGGTGACCTTTTAAAGCGTTTTTAAG-3' (minus strand primer
#1170) (SEQ ID NO. 26).



WO 95/12676 PCT/US94/12352

Cloning and Sequencing of OspC 

Similar methods were also used to isolate OspC genes.
The following primers were used to isolate entire OspC
genes from Borrelia strains B31, K48, PKo, and pTrob:

5'-GTGCGCGACCATATGAAAAAGAATACATTAAGTGCG-3' (plus strand
primer having NdeI site combined with start codon) (SEQ ID
NO. 27); and
5'-GTCGGCGGATCCTTAAGGTTTTTTTGGACTTTCTGC-3' (minus strand
primer having BamHI site followed by stop codon) (SEQ ID
NO. 28).
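
The cloning sites described for these two primers can be checked mechanically. The following is a minimal sketch, not part of the original protocol: the helper names are hypothetical, and the recognition sequences CATATG (NdeI) and GGATCC (BamHI) are the standard ones rather than values stated in this document.

```python
# Illustrative check of the cloning sites in the OspC primers
# (SEQ ID NO. 27 and 28). Helper names are hypothetical.

NDEI = "CATATG"   # NdeI site; its last three bases (ATG) can act as a start codon
BAMHI = "GGATCC"  # BamHI site (palindromic)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


plus_primer = "GTGCGCGACCATATGAAAAAGAATACATTAAGTGCG"   # SEQ ID NO. 27
minus_primer = "GTCGGCGGATCCTTAAGGTTTTTTTGGACTTTCTGC"  # SEQ ID NO. 28

# Plus strand primer: the ATG inside the NdeI site begins the coding sequence.
ndei_at = plus_primer.find(NDEI)
coding_start = plus_primer[ndei_at + 3:]

# Minus strand primer: read on the plus strand, the three bases just
# upstream of the BamHI site should form a stop codon.
minus_as_plus = reverse_complement(minus_primer)
bamhi_at = minus_as_plus.find(BAMHI)
stop_codon = minus_as_plus[bamhi_at - 3:bamhi_at]

print("NdeI site at index", ndei_at, "coding sequence starts", coding_start[:9])
print("codon before BamHI on the plus strand:", stop_codon)
```

Reading the minus strand primer as its reverse complement is what places the stop codon upstream of the BamHI site in the amplified product.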

The nucleic acid sequences of the OspC genes were then
determined by the dideoxy chain-termination technique using
the Sequenase kit (United States Biochemical). OspC
genes isolated and sequenced in this manner include those
for strains B31, K48, PKo, and Tro; the nucleic acid
sequences are depicted in the sequence listing as SEQ ID
NO. 29 (OspC-B31), SEQ ID NO. 31 (OspC-K48), SEQ ID NO. 33
(OspC-PKo), and SEQ ID NO. 35 (OspC-Tro). An alignment of
these sequences is shown in Figure 38. The amino acid
sequences of the proteins encoded by these nucleic acid
sequences are represented as SEQ ID NO. 30 (OspC-B31), SEQ
ID NO. 32 (OspC-K48), SEQ ID NO. 34 (OspC-PKo), and SEQ ID
NO. 36 (OspC-Tro).

Truncated OspC genes were generated using other
primers. These primers were designed to amplify nucleic
acid sequences, derived from the OspC gene, that lacked the
nucleic acids encoding the signal peptidase sequence of the
full-length protein. The primers corresponded to bp 58-75
of the natural protein, with codons for Met-Ala attached
ahead. For strain B31, the following primer was used:
5'-GTGCGCGACCATATGGCTAATAATTCAGGGAAAGAT-3' (SEQ ID NO.
37).
For strain PKo,
5'-GTGCGCGACCATATGGCTAGTAATTCAGGGAAAGGT-3' (SEQ ID NO. 38)
was used.
For strains pTrob and K48,
5'-GTGCGCGACCATATGGCTAATAATTCAGGTGGGGAT-3' (SEQ ID NO. 39)
was used.
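
The Met-Ala start added by these primers can be seen by translating each primer in frame from the ATG. The sketch below is illustrative only: it assumes that the ATG lies inside a CATATG (NdeI) site, as in the primers listed above, and it uses the standard genetic code; the function name is hypothetical.

```python
# Illustrative translation of the 5' end encoded by the truncation primers
# (SEQ ID NO. 37-39), using the standard genetic code.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_ndei(primer: str) -> str:
    """Translate in frame, starting at the ATG inside the CATATG site."""
    site = primer.find("CATATG")
    if site < 0:
        raise ValueError("no CATATG site found in primer")
    coding = primer[site + 3:]
    usable = len(coding) - len(coding) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


primers = {
    "B31": "GTGCGCGACCATATGGCTAATAATTCAGGGAAAGAT",        # SEQ ID NO. 37
    "PKo": "GTGCGCGACCATATGGCTAGTAATTCAGGGAAAGGT",        # SEQ ID NO. 38
    "pTrob/K48": "GTGCGCGACCATATGGCTAATAATTCAGGTGGGGAT",  # SEQ ID NO. 39
}

for strain, sequence in primers.items():
    print(strain, translate_from_ndei(sequence))
```

Each translation begins with Met-Ala, followed by the residues encoded by the gene-specific part of the primer.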

Additional primers were also designed to amplify
nucleic acids encoding particular polypeptides, for use in
creation of chimeric nucleic acid sequences (see Example
4). These primers included:

5'-CTTGGAAAATTATTTGAA-3' (plus strand primer #520)
(SEQ ID NO. 40);
5'-CACGGTCACCCCATGGGAAATAATTCAGGGAAAGG-3' (plus strand
primer #58) (SEQ ID NO. 41);
5'-TATAGATGACAGCAACGC-3' (minus strand primer #207)
(SEQ ID NO. 42); and
5'-CCGGTGACCCCATGGTACCAGGTTTTTTTGGACTTTCTGC-3' (minus
strand primer #636) (SEQ ID NO. 43).

Cloning and Sequencing of OspD 

Similar methods can be used to isolate OspD genes. An
alignment of four OspD nucleic acid sequences (from strains
pBo, PGau, DK29, and K48) is shown in Figure 39.

Cloning and Sequencing of p12

The p12 gene was similarly identified. Primers used
to clone the entire p12 gene included:
5'-CCGGATCCATATGGTTAAAAAAATAATATTTATTTC-3' (forward primer
#757) (SEQ ID NO. 44); and
5'-GATATCTAGATCTTTAATTGCTCTGCTCACTCTCTTC-3' (reverse primer
#758) (SEQ ID NO. 45).

To amplify a truncated p12 gene (one in which the
encoded protein is non-lipidated, and begins at amino
acid 18 of the native sequence), the following primers were
used: 5'-CCGGGATCCATATGGCTAGTGCAATTGGTCGTGG-3' (forward
primer #759) (SEQ ID NO. 46); and primer #758 (SEQ ID NO.
45).



Cloning and Sequencing of p41 (fla)

A similar approach was used to clone and sequence
genes encoding the p41 (fla) protein. The p41 sequences
listed in Table II with GenBank accession numbers were
isolated using the following primers from strain B31:

5'-ATGATTATCAATCATAAT-3' (+1 to +18) (SEQ ID NO. 47); and
5'-TCTGAACAATGACAAAAC-3' (+1008 to +991) (SEQ ID NO. 48).

The nucleic acid sequences of p41 isolated in this manner
are depicted in the sequence listing as SEQ ID NO. 51
(p41-PGau), and SEQ ID NO. 53 (p41-DK29). An alignment of
several p41 nucleic acid sequences, including those for
strains B31, pKal, PGau, pBo, DK29, and pKo, is shown in
Figure 41. The amino acid sequences of the proteins
encoded by these nucleic acid sequences are represented as
SEQ ID NO. 50 (p41-K48), SEQ ID NO. 52 (p41-PGau), SEQ ID
NO. 54 (p41-DK29), SEQ ID NO. 56 (p41-PTrob), and SEQ ID
NO. 58 (p41-PHei).
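
The coordinate ranges given with the p41 isolation primers can be compared against the primer lengths. The sketch below is illustrative and the function name is hypothetical; it assumes the common numbering convention in which +1 is the first base of the start codon and there is no position 0, which this document does not state explicitly.

```python
def span_length(first: int, last: int) -> int:
    """Number of bases covered by an inclusive coordinate range.

    Assumes +1 is the first base of the start codon and that the
    numbering skips 0 (so -1 is immediately followed by +1).
    Minus strand primers are given high-to-low, so order is ignored.
    """
    if first == 0 or last == 0:
        raise ValueError("position 0 does not exist in this convention")
    low, high = sorted((first, last))
    length = high - low + 1
    if low < 0 < high:
        length -= 1  # the range crosses the missing position 0
    return length


p41_primers = [
    ("ATGATTATCAATCATAAT", 1, 18),      # SEQ ID NO. 47
    ("TCTGAACAATGACAAAAC", 1008, 991),  # SEQ ID NO. 48
]

for sequence, first, last in p41_primers:
    print(sequence, len(sequence), span_length(first, last))
```

A mismatch between a primer's length and its stated range would indicate either added non-template bases (such as a restriction site tail) or a transcription error in the coordinates.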

Other primers were designed to amplify nucleic acid
sequences encoding polypeptides of p41, to be used in
chimeric nucleic acid sequences. These primers included:

5'-TTGGATCCGGTCACCCCATGGCTCAATATAACCAATG-3' (minus strand
primer #122) (SEQ ID NO. 59);
5'-TTGGATCCGGTCACCCCATGGCTTCTCAAAATGTAAG-3' (plus strand
primer #140) (SEQ ID NO. 60);
5'-TTGGATCCGGTGACCAACTCCGCCTTGAGAAGG-3' (minus strand
primer #234) (SEQ ID NO. 61); and
5'-TTGGATCCGGTGACCTATTTGAGCATAAGATGC-3' (minus strand
primer #141) (SEQ ID NO. 62).

Cloning and Sequencing of p93 

The same approach was also used to clone and sequence
genes encoding the p93 protein. Genes encoding p93, as
listed in Table II with GenBank accession numbers, were
isolated by this method with the following primers from
strain B31:

5'-GGTGAATTTAGTTGGTAAGG-3' (-54 to -35) (SEQ ID NO. 63);
and
5'-CACCAGTTTCTTTAAGCTGCTCCTGC-3' (+1117 to +1092) (SEQ ID
NO. 64).

The nucleic acid sequences of p93 isolated in this
manner are depicted in the sequence listing as SEQ ID NO.
65 (p93-B31), SEQ ID NO. 67 (p93-K48), SEQ ID NO. 69
(p93-PBo), SEQ ID NO. 71 (p93-PTrob), SEQ ID NO. 73
(p93-PGau), SEQ ID NO. 75 (p93-25015), and SEQ ID NO. 77
(p93-PKo). The amino acid sequences of the proteins encoded
by these nucleic acid sequences are represented as SEQ ID
NO. 66 (p93-B31), SEQ ID NO. 68 (p93-K48), SEQ ID NO. 70
(p93-PBo), SEQ ID NO. 72 (p93-PTrob), SEQ ID NO. 74
(p93-PGau), SEQ ID NO. 76 (p93-25015), and SEQ ID NO. 78
(p93-PKo).

Other primers were used to amplify nucleic acid
sequences encoding polypeptides of p93 to be used in
generating chimeric nucleic acid sequences. These primers
included:

5'-CCGGTCACCCCATGGCTGCTTTAAAGTCTTTA-3' (plus strand primer
#475) (SEQ ID NO. 79);
5'-CCGGTCACCCCATGAATCTTGATAAAGCTCAG-3' (plus strand primer
#900) (SEQ ID NO. 80);
5'-CCGGTCACCCCATGGATGAAAAGCTTTTAAAAAGT-3' (plus strand
primer #1168) (SEQ ID NO. 81);
5'-CCGGTCACCCCCATGGTTGAGAAATTAGATAAG-3' (plus strand
primer #1423) (SEQ ID NO. 82); and
5'-TTGGATCCGGTGACCCTTAACTTTTTTTAAAG-3' (minus strand
primer #2100) (SEQ ID NO. 83).

C. Expression of Proteins from Borrelia Genes

The nucleic acid sequences described above can be
incorporated into expression plasmids, using standard
techniques, and transfected into compatible host cells in
order to express the proteins encoded by the nucleic acid
sequences. As an example, the expression of the p12 gene
and the isolation of p12 protein are set forth.

Amplification of the p12 nucleic acid sequence was
conducted with primers that incorporated an NdeI
restriction site into the nucleic acid sequence. The PCR
product was extracted with phenol/chloroform and
precipitated with ethanol. The precipitated product was
digested and ligated into an expression plasmid as follows:
15 µl (approximately 1 µg) of PCR DNA was combined with 2 µl
10X restriction buffer for NdeI (Gibco/BRL), 1 µl NdeI
(Gibco/BRL), and 2 µl distilled water, and incubated
overnight at 37°C. This mixture was subsequently combined
with 3 µl 10X buffer (buffer 3, New England BioLabs), 1 µl
BamHI (NEB), and 6 µl distilled water, and incubated at
37°C for two hours. The resultant material was purified by
preparative gel electrophoresis using low melting point
agarose, and the band was visualized under long wave
ultraviolet light and excised from the gel. The gel slice
was treated with Gelase using conditions recommended by the
manufacturer (Epicentre Technologies). The resulting DNA
pellet was resuspended in 25-50 µl of 10 mM Tris-Cl (pH
8.0) and 1 mM EDTA (TE). An aliquot of this material was
ligated into the pET9c expression vector (Dunn, J.J. et
al., Protein Expression and Purification 1: 159 (1990)).

To ligate the material into the pET9c expression
vector, 20-50 ng of p12 nucleic acid sequences cut and
purified as described above was combined with 5 µl 10X
One-Phor-All (OPA) buffer (Pharmacia), 30-60 ng pET9c cut
with NdeI and BamHI, 2.5 µl 20 mM ATP, 2 µl T4 DNA ligase
(Pharmacia) diluted 1:5 in 1X OPA buffer, and sufficient
distilled water to bring the final volume to 50 µl. The
mixture was incubated at 12°C overnight.
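
The volumes in the sequential digest and the ligation can be tallied to confirm the working concentrations. The following is a small illustrative calculation; the variable names are hypothetical, and it assumes that the One-Phor-All buffer stock is 10X, which is how the garbled source text is read here.

```python
# Illustrative bookkeeping for the digest and ligation volumes (all in µl).

ndei_step = {"PCR DNA": 15, "10X NdeI buffer": 2, "NdeI": 1, "water": 2}
bamhi_step = {"10X buffer 3": 3, "BamHI": 1, "water": 6}

# Volume after the first (NdeI) digest, and after BamHI reagents are added.
ndei_volume = sum(ndei_step.values())
bamhi_volume = ndei_volume + sum(bamhi_step.values())

# Working strength of each 10X buffer in the volume it was diluted into.
ndei_buffer_fold = 10 * ndei_step["10X NdeI buffer"] / ndei_volume
buffer3_fold = 10 * bamhi_step["10X buffer 3"] / bamhi_volume

# Ligation: 5 µl of 10X OPA (assumed stock strength) and 2.5 µl of
# 20 mM ATP in a 50 µl final volume.
ligation_volume = 50
opa_fold = 10 * 5 / ligation_volume
atp_mM = 20 * 2.5 / ligation_volume

print(ndei_volume, bamhi_volume, ndei_buffer_fold, buffer3_fold)
print(opa_fold, atp_mM)
```

Under these assumptions each buffer ends up at 1X in its reaction and the ligation contains 1 mM ATP.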

The resultant ligations were transformed into
competent DH5-alpha cells and plated on nutrient agar
plates containing 50 µg/ml kanamycin and incubated
overnight at 37°C. DH5-alpha is used as a "storage
strain" for T7 expression clones, because it is RecA
deficient, so that recombination and concatenation are not
problematic, and because it lacks the T7 RNA polymerase
gene necessary to express the cloned gene. The use of this
strain allows for cloning of potentially toxic gene
products while minimizing the chance of deletion and/or
rearrangement of the desired genes. Other cell lines
having similar properties may also be used.

Kanamycin resistant colonies were single-colony
purified on nutrient agar plates supplemented with
kanamycin at 50 µg/ml. A colony from each isolate was
inoculated into 3-5 ml of liquid medium containing 50 µg/ml
kanamycin, and incubated at 37°C without agitation.
Plasmid DNA was obtained from 1 ml of each isolate using a
hot alkaline lysis procedure (Maniatis, T. et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY (1982)).

Plasmid DNA was digested with EcoRI and BglII in the
following manner: 15 µl plasmid DNA was combined with 2 µl
10X buffer 3 (NEB), 1 µl EcoRI (NEB), 1 µl BglII (NEB) and
1 µl distilled water, and incubated for two hours at 37°C.
The entire reaction mixture was electrophoresed on an
analytical agarose gel. Plasmids carrying the p12 insert
were identified by the presence of a band corresponding to
925 base-pairs (full length p12) or 875 base-pairs
(nonlipidated p12).

One or two plasmid DNAs from the full length and
nonlipidated p12 clones in pET9c were used to transform
BL21 DE3 pLysS to kanamycin resistance as described by
Studier et al. (Methods in Enzymology, Goeddel, D. (Ed.),
Academic Press, 185: 60-89 (1990)). One or two
transformants of the full length and nonlipidated clones
were single-colony purified on nutrient plates containing
25 µg/ml chloramphenicol (to maintain pLysS) and 50 µg/ml
kanamycin at 37°C. One colony of each isolate was
inoculated into liquid medium supplemented with
chloramphenicol and kanamycin and incubated overnight at
37°C. The overnight culture was subcultured the following
morning into 500 ml of liquid broth with chloramphenicol
(25 µg/ml) and kanamycin (50 µg/ml) and grown with aeration
at 37°C in an orbital air-shaker until the absorbance at
600 nm reached 0.4-0.7. Isopropyl-thio-galactoside (IPTG)
was added to a final concentration of 0.5 mM, for
induction, and the culture was incubated for 3-4 hours at
37°C as before. The induced cells were pelleted by
centrifugation and resuspended in 25 ml of 20 mM NaPO4 (pH
7.7). A small aliquot was removed for analysis by gel
electrophoresis. Expressing clones produced proteins which
migrated at the 12 kDa position.

A crude cell lysate was prepared from the culture as
described for recombinant OspA by Dunn, J.J. et al.
(Protein Expression and Purification 1: 159 (1990)). The
crude lysate was first passed over a Q-sepharose column
(Pharmacia) which had been pre-equilibrated in Buffer A:
10 mM NaPO4 (pH 7.7), 10 mM NaCl, 0.5 mM PMSF. The column
was washed with 10 mM NaPO4, 50 mM NaCl and 0.5 mM PMSF and
then p12 was eluted in 10 mM NaPO4, 0.5 mM PMSF with a NaCl
gradient from 50-400 mM. p12 eluted approximately halfway
through the gradient between 100 and 200 mM NaCl. The peak
fractions were pooled and dialyzed against 10 mM NaPO4 (pH
7.7), 10 mM NaCl, 0.5 mM PMSF. The protein was then
concentrated and applied to a Sephadex G50 gel filtration
column of approximately 50 ml bed volume (Pharmacia), in 10
mM NaPO4, 200 mM NaCl, 0.5 mM PMSF. p12 would typically
elute shortly after the excluded volume marker. Peak
fractions were determined by running small aliquots of all
fractions on a gel. The p12 peak was pooled and stored in
small aliquots at -20°C.
Example 4. Generation of Chimeric Nucleic Acid
Sequences and Chimeric Proteins

General Protocol for Creation of Chimeric Nucleic Acid
Sequences

The megaprimer method of site directed mutagenesis and
its modification were used to generate chimeric nucleic
acid sequences (Sarkar and Sommer, Biotechniques 8(4):
404-407 (1990); Aiyar, A. and J. Leis, Biotechniques 14(3):
366-369 (1993)). A 5' primer for the first genomic
template and a 3' fusion oligo are used to amplify the
desired region. The fusion primer consists of a 3' end of
the first template (DNA that encodes the amino-proximal
polypeptide of the fusion protein), coupled to a 5' end of
the second template (DNA that encodes the carboxy-proximal
polypeptide of the fusion protein).

The PCR amplifications are performed using Taq DNA
polymerase, 10X PCR buffer, and MgCl2 (Promega Corp.,
Madison, WI), and Ultrapure dNTPs (Pharmacia, Piscataway,
NJ). One µg of genomic template 1, 5 µl of 10 µM 5' oligo
and 5 µl of 10 µM fusion oligo are combined with the
following reagents at indicated final concentrations: 10X
Buffer-Mg FREE (1X), MgCl2 (2 mM), dNTP mix (200 µM each
dNTP), Taq DNA polymerase (2.5 units), water to bring final
volume to 100 µl. A Thermal Cycler (Perkin Elmer Cetus,
Norwalk, CT) is used to amplify under the following
conditions: 35 cycles at 95°C for one minute, 55°C for two
minutes, and 72°C for three minutes. This procedure results
in a "megaprimer".
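
The oligo amounts and the programmed cycling time implied by this reaction can be worked out directly. The sketch below is illustrative arithmetic only; the variable names are hypothetical, and the time estimate ignores ramping between temperatures, which depends on the instrument.

```python
# Illustrative arithmetic for the first-step PCR described above.

oligo_volume_ul = 5     # volume of each oligo added
oligo_stock_uM = 10     # stock concentration of each oligo
reaction_volume_ul = 100

# µl x µM (= µl x pmol/µl) gives pmol of oligo in the reaction.
oligo_pmol = oligo_volume_ul * oligo_stock_uM
oligo_final_uM = oligo_pmol / reaction_volume_ul

# Hold times per cycle in minutes: denaturation, annealing, extension.
cycles = 35
hold_minutes = (1, 2, 3)
programmed_minutes = cycles * sum(hold_minutes)

print(oligo_pmol, "pmol of each oligo, final", oligo_final_uM, "µM")
print(programmed_minutes, "minutes of programmed hold time")
```

Under these figures each oligo is present at 50 pmol (0.5 µM), and the holds alone take 210 minutes.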

The resulting megaprimer is run on a 1X TAE, 4%
low-melt agarose gel. The megaprimer band is cut from the
gel and purified using the Promega Magic PCR Preps DNA
purification system. Purified megaprimer is then used in a
second PCR step. One µg of genomic template 2,
approximately 0.5 µg of the megaprimer, and 5 µl of 10 µM
3' oligo are added to a cocktail of 10X buffer, MgCl2,
dNTPs and Taq at the same final concentrations as noted
above, and brought to 100 µl with water. PCR conditions are
the same as above. The fusion product resulting from this
amplification is also purified using the Promega Magic PCR
Preps DNA purification system.
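
The two-step logic of the megaprimer method can be illustrated in silico. The following is a toy sketch under stated assumptions, not the procedure actually used: the templates are random placeholder sequences rather than Borrelia genes, all function names are hypothetical, and "PCR" is idealized as exact matching of the 3'-most 12 bases of each primer, with any 5' tail carried into the product.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def idealized_pcr(template, plus_primer, minus_primer, anneal=12):
    """Return the plus strand product of an idealized PCR.

    Only the 3'-most `anneal` bases of each primer must match the
    template exactly; 5' tails are copied into the product.
    """
    site_plus = template.find(plus_primer[-anneal:])
    minus_as_plus = reverse_complement(minus_primer)
    site_minus = template.find(minus_as_plus[:anneal],
                               max(site_plus, 0) + anneal)
    if site_plus < 0 or site_minus < 0:
        raise ValueError("a primer does not anneal to this template")
    return (plus_primer
            + template[site_plus + anneal:site_minus]
            + minus_as_plus)


# Placeholder templates (random sequence, fixed seed for repeatability).
rng = random.Random(0)
template_1 = "".join(rng.choice("ACGT") for _ in range(120))
template_2 = "".join(rng.choice("ACGT") for _ in range(120))
junction = 60  # chimera = template_1[:60] followed by template_2[60:]

five_prime_oligo = template_1[:18]
three_prime_oligo = reverse_complement(template_2[-18:])
# Minus strand fusion oligo: its 3' end anneals to template 1, and its
# 5' tail carries the start of the template 2 segment.
fusion_oligo = reverse_complement(
    template_1[junction - 15:junction] + template_2[junction:junction + 15]
)

# Step 1: 5' oligo + fusion oligo on template 1 yields the megaprimer.
megaprimer = idealized_pcr(template_1, five_prime_oligo, fusion_oligo)
# Step 2: megaprimer + 3' oligo on template 2 yields the fusion product.
fusion_product = idealized_pcr(template_2, megaprimer, three_prime_oligo)

print(len(megaprimer), len(fusion_product))
```

In this idealized model the megaprimer ends with the first bases of the second template, which is what allows it to prime on template 2 in the second step.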

The fusion product is then ligated into TA vector and
transformed into E. coli using the Invitrogen (San Diego,
CA) TA Cloning Kit. Approximately 50 ng of PCR fusion
product is ligated to 50 ng of pCRII vector with 1X
Ligation Buffer, 4 units of T4 ligase, and brought to 10 µl
with water. This ligated product mixture is incubated at
12°C overnight (approximately 14 hours). Two µl of the
ligation product mixture is added to 50 µl competent INVαF'
cells and 2 µl beta-mercaptoethanol. The cells are then
incubated for 30 minutes, followed by heat shock treatment
at 42°C for 60 seconds, and an ice quenching for two
minutes. 450 µl of warmed SOC media is then added to the
cells, resulting in a transformed cell culture which is
incubated at 37°C for one hour with slight shaking. 50 µl
of the transformed cell culture is plated on LB + 50 µg/µl
ampicillin plates and incubated overnight at 37°C. Single
white colonies are picked and added to individual overnight
cultures containing 3 ml LB with ampicillin (50 µg/µl).
The individual overnight cultures are prepared using
Promega's Magic Miniprep DNA purification system. A small
amount of the resulting DNA is cut using a restriction
digest as a check.

DNA sequencing is then performed to check the sequence of
the fusion nucleic acid sequence, using the United States
Biochemical (Cleveland, OH) Sequenase Version 2.0 DNA
sequencing kit. Three to five µg of plasmid DNA is used per
reaction. 2 µl 2 M NaOH/2 mM EDTA are added to the DNA, and
the volume is brought to 20 µl with water. The mixture is
then incubated at room temperature for five minutes. 7 µl
water, 3 µl 3 M NaAc, and 75 µl EtOH are added. The
resultant mixture is mixed by vortex and incubated for ten
minutes at -70°C, and then subjected to microfugation.
After microfugation for ten minutes, the supernatant is
aspirated off, and the pellet is dried in the speed vac for
30 seconds. 6 µl water, 2 µl annealing buffer, and 2 µl of
10 µM of the appropriate oligo are then added. This mixture
is incubated for 10 minutes at 37°C and then allowed to
stand at room temperature for 10 minutes. Subsequently,
5.5 µl of label cocktail (described above) is added to each
sample of the mixture, which is incubated at room
temperature for an additional five minutes. 3.5 µl labeled
DNA is then added to each sample, which is then incubated
for five minutes at 37°C. 4 µl stop solution is added to
each well. The DNA is denatured at 95°C for two minutes,
and then placed on ice.

Clones with the desired fusion nucleic acid sequences
are then recloned in frame in the pET expression system in
the lipidated (full length) and non-lipidated (truncated,
i.e., without first 17 amino acids) forms. The product is
amplified using restriction sites contained in the PCR
primers. The vector and product are cut with the same
enzymes and ligated together with T4 ligase. The resultant
plasmid is transformed into competent E. coli using
standard transformation techniques. Colonies are screened
as described earlier and positive clones are transformed
into expression cells, such as E. coli BL21, for protein
expression with IPTG for induction. The expressed protein
in its bacterial culture lysate form and/or purified form
is then injected in mice for antibody production. The mice
are bled, and the sera collected for agglutination, in
vitro growth inhibition, and complement-dependent and
-independent lysis tests.
B. Specific Chimeric Nucleic Acid Sequences

Various chimeric nucleic acid sequences were
generated. The nucleic acid sequences are described as
encoding polypeptides from Borrelia proteins. The chimeric
nucleic acid sequences are produced such that the nucleic
acid sequence encoding one polypeptide is in the same
reading frame as the nucleic acid sequence encoding the
next polypeptide in the chimeric protein sequence encoded
by the chimeric nucleic acid sequence. The proteins are
listed sequentially (in order of presence of the encoding
sequence) in the description of the chimeric nucleic acid
sequence. For example, if a chimeric nucleic acid sequence
consisting of bp 1-650 from OspA-1 and bp 651-820 from
OspA-2 were sequenced, the sequence of the chimer would
include the first 650 base pairs from OspA-1 followed
immediately by base pairs 651-820 of OspA-2.
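
The base-pair notation used for the chimers below can be expressed as a short routine. This is an illustrative sketch only: the function names are hypothetical, the two "genes" are placeholder strings rather than real OspA sequences, and the frame check assumes that each gene is numbered from +1 at the first base of its start codon.

```python
def assemble_chimera(segments):
    """Join segments given as (sequence, first_bp, last_bp).

    Coordinates are 1-based and inclusive, matching the "bp 1-650"
    style of notation used in the text.
    """
    parts = []
    for sequence, first, last in segments:
        if not 1 <= first <= last <= len(sequence):
            raise ValueError(f"range {first}-{last} is outside the sequence")
        parts.append(sequence[first - 1:last])
    return "".join(parts)


def junction_in_frame(previous_last: int, next_first: int) -> bool:
    """True if the reading frame is preserved across a junction.

    Assumes both genes are numbered from +1 at the start codon.
    """
    return previous_last % 3 == (next_first - 1) % 3


# Placeholder sequences standing in for two 820 bp OspA genes.
ospa_1 = "A" * 820
ospa_2 = "C" * 820

chimer = assemble_chimera([(ospa_1, 1, 650), (ospa_2, 651, 820)])
print(len(chimer), junction_in_frame(650, 651))
```

When consecutive segments come from aligned positions of corresponding genes (for example 650 followed by 651), the frame is preserved automatically; the check matters most when segments from non-corresponding positions are joined.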

OspA-K48/OspA-PGau  A chimer of OspA from strain
K48 (OspA-K48) and OspA from strain PGau (OspA-PGau) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-654 from OspA-K48,
followed by bp 655-820 from OspA-PGau. Primers used
included: the amino-terminal sequence of OspA primer #607
(SEQ ID NO. 16); the fusion primer,
5'-AAAGTAGAAGTTTTTGAATCCCATTTTCCAGTTTTTTT-3' (minus strand
primer #668-654) (SEQ ID NO. 84); the carboxy-terminal
sequence of OspA primer #586 (SEQ ID NO. 19); and the
sequence primers #369 (SEQ ID NO. 14) and #357 (SEQ ID NO.
15). The chimeric nucleic acid sequence is presented as
SEQ ID NO. 85; the chimeric protein encoded by this
chimeric nucleic acid sequence is presented as SEQ ID NO.
86.

OspA-B31/OspA-PGau  A chimer of OspA from strain B31
(OspA-B31) and OspA from strain PGau (OspA-PGau) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-651 from OspA-B31,
followed by bp 652-820 from OspA-PGau. Primers used
included: the fusion primer,
5'-AAAGTAGAAGTTTTTGAATTCCAAGCTGCAGTTTT-3' (minus strand
primer #668-651) (SEQ ID NO. 87); and the sequence primer,
#369 (SEQ ID NO. 14). The chimeric nucleic acid sequence
is presented as SEQ ID NO. 88; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 89.

OspA-B31/OspA-K48  A chimer of OspA from strain B31
(OspA-B31) and OspA from strain K48 (OspA-K48) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-651 from OspA-B31,
followed by bp 652-820 from OspA-K48. Primers used
included: the fusion primer,
5'-AAAGTGGAAGTTTTTGAATTCCAAGCTGCAGTTTTTTT-3' (minus strand
primer #671-651) (SEQ ID NO. 90); and the sequence primer,
#369 (SEQ ID NO. 14). The chimeric nucleic acid sequence
is presented as SEQ ID NO. 91; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 92.

OspA-B31/OspA-25015  A chimer of OspA from strain B31
(OspA-B31) and OspA from strain 25015 (OspA-25015) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-651 from OspA-B31,
followed by bp 652-820 from OspA-25015. Primers used
included: the fusion primer,
5'-TAAAGTTGAAGTGCCTGCATTCCAAGCTGCAGTTT-3'
(SEQ ID NO. 93). The chimeric nucleic acid sequence is
presented as SEQ ID NO. 94; the chimeric protein encoded by
this chimeric nucleic acid sequence is presented as SEQ ID
NO. 95.
OspA-K48/OspA-B31/OspA-K48  A chimer of OspA from strain
B31 (OspA-B31) and OspA from strain K48 (OspA-K48) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-570 from OspA-B31,
followed by bp 570-651 from OspA-B31, followed by bp
650-820 from OspA-K48. Primers used included: the fusion
primer, 5'-CCCCAGATTTTGAAATCTTGCTTAAAACAAC-3' (SEQ ID NO.
96); and the sequence primer, #357 (SEQ ID NO. 15). The
chimeric nucleic acid sequence is presented as SEQ ID NO.
97; the chimeric protein encoded by this chimeric nucleic
acid sequence is presented as SEQ ID NO. 98.

OspA-B31/OspA-K48/OspA-B31/OspA-K48  A chimer of OspA
from strain B31 (OspA-B31) and OspA from strain K48
(OspA-K48) was generated using the method described above.
This chimeric nucleic acid sequence included bp 1-420 from
OspA-B31, followed by bp 420-570 from OspA-K48, followed by
bp 570-650 from OspA-B31, followed by bp 651-820 from
OspA-K48. Primers used included: the fusion primer,
5'-CAAGTCTGGTTCCAATTTGCTCTTGTTATTAT-3' (minus strand primer
#436-420) (SEQ ID NO. 99); and the sequence primer, #357
(SEQ ID NO. 15). The chimeric nucleic acid sequence is
presented as SEQ ID NO. 100; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 101.

OspA-B31/OspB-B31  A chimer of OspA and OspB from strain
B31 (OspA-B31, OspB-B31) was generated using the method
described above. The chimeric nucleic acid sequence
included bp 1-651 from OspA-B31, followed by bp 652-820
from OspB-B31. Primers used included: the fusion primer,
5'-GTTAAAGTGCTAGTACTGTCATTCCAAGCTGCAGTTTTTTT-3' (minus
strand primer #740-651) (SEQ ID NO. 102); the
carboxy-terminal sequence of OspB primer #1106 (SEQ ID NO.
25); and the sequence primer #357 (SEQ ID NO. 15). The
chimeric nucleic acid sequence is presented as SEQ ID NO.
103; the chimeric protein encoded by this chimeric nucleic
acid sequence is presented as SEQ ID NO. 104.

OspA-B31/OspB-B31/OspC-B31  A chimer of OspA, OspB and
OspC from strain B31 (OspA-B31, OspB-B31, and OspC-B31) was
generated using the method described above. The chimeric
nucleic acid sequence included bp 1-650 from OspA-B31,
followed by bp 652-820 from OspB-B31, followed by bp 74-630
of OspC-B31. Primers used included: the fusion primer,
5'-TGCAGATGTAATCCCATCCGCCATTTTTAAAGCGTTTTT-3' (SEQ ID NO.
105); and the carboxy-terminal sequence of OspC primer (SEQ
ID NO. 28). The chimeric nucleic acid sequence is
presented as SEQ ID NO. 106; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 107.

OspC-B31/OspA-B31/OspB-B31  A chimer of OspA, OspB and
OspC from strain B31 (OspA-B31, OspB-B31, and OspC-B31) was
generated using the method described above. The chimeric
nucleic acid sequence included bp 1-630 from OspC-B31,
followed by bp 52-650 from OspA-B31, followed by bp 650-820
of OspB-B31. Primers used included: the amino-terminal
sequence of OspC primer having SEQ ID NO. 27; the fusion
primer, 5'-GCTGCTAACATTTTGCTTAGGTTTTTTTGGACTTTC-3' (minus
strand primer #69-630) (SEQ ID NO. 108); and the sequence
primers #520 (SEQ ID NO. 40) and #200 (SEQ ID NO. 18). The
chimeric nucleic acid sequence is presented as SEQ ID NO.
109; the chimeric protein encoded by this chimeric nucleic
acid sequence is presented as SEQ ID NO. 110.

Additional Chimeric Nucleic Acid Sequences

Using the methods described above, other chimeric
nucleic acid sequences were produced. These chimeric
nucleic acid sequences, and the proteins encoded, are
summarized in Table III.

Table III  Chimeric Nucleic Acid Sequences and the Encoded Proteins

Chimers Generated (base pairs)                  SEQ ID NO. (nt)  SEQ ID NO. (protein)

OspA (52-882) / p93 (1168-2100)                 111              112
OspB (45-891) / p41 (122-234)                   113              114
OspB (45-891) / p41 (122-295)                   115              116
OspB (45-891) / p41 (140-234)                   117              118
OspB (45-891) / p41 (140-295)                   119              120
OspB (45-891) / p41 (122-234) / OspC (58-633)   121              122
OspA-Tro/OspA-Bo                                137              138
OspA-PGau/OspA-Bo                               139              140
OspA-B31/OspA-PGau/OspA-B31/OspA-K48            141              142
OspA-PGau/OspA-B31/OspA-K48                     143              144

C. Purification of Proteins Generated by Chimeric Nucleic
Acid Sequences

The chimeric nucleic acid sequences described above,
as well as chimeric nucleic acid sequences produced by the
methods described above, are used to produce chimeric
proteins encoded by the nucleic acid sequences. Standard
methods, such as those described above in Example 3,
concerning the expression of proteins from Borrelia genes,
can be used to express the proteins in a compatible host
organism. The chimeric proteins can then be isolated and
purified using standard techniques.

If the chimeric protein is soluble, it can be purified
on a Sepharose column. Insoluble proteins can be
solubilized in guanidine and purified on a Ni++ column;
alternatively, they can be solubilized in 10 mM NaPO4 with
0.1 - 1% TRITON X-114, and subsequently purified over an S
column (Pharmacia). Lipidated proteins were generally
purified by the latter method. Solubility was determined
by separating both soluble and insoluble fractions of cell
lysate on a 12% PAGE gel, and checking for the localization
of the protein by Coomassie staining, or by Western
blotting with monoclonal antibodies directed to an
antigenic polypeptide of the chimeric protein.



Equivalents

Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be
encompassed in the scope of the following claims.



CLAIMS 

What is claimed is : 

1. A chimeric protein comprising two or more antigenic
Borrelia polypeptides, wherein the antigenic Borrelia
polypeptides which comprise the chimeric protein do
not occur naturally in the same protein in Borrelia.

2. The chimeric protein of Claim 1, wherein the antigenic 
Borrelia polypeptides are from two or more different 
species of Borrelia. 

3. The chimeric protein of Claim 2, wherein the antigenic
Borrelia polypeptides are derived from Borrelia
proteins selected from the group consisting of: outer
surface protein A, outer surface protein B, outer
surface protein C, outer surface protein D, p12, p39,
p41, p66, and p93.

4. The chimeric protein of Claim 3, wherein the antigenic
Borrelia polypeptides are from corresponding proteins
from two or more different species of Borrelia.

5. The chimeric protein of Claim 3, wherein the antigenic
Borrelia polypeptides are from non-corresponding
proteins from at least two different species of
Borrelia.

6. The chimeric protein of Claim 1, wherein two or more
antigenic Borrelia polypeptides are from the same
species of Borrelia.

7. The chimeric protein of Claim 6, wherein the antigenic
Borrelia polypeptides are derived from Borrelia
proteins selected from the group consisting of: outer
surface protein A, outer surface protein B, outer
surface protein C, outer surface protein D, p12, p39,
p41, p66, and p93.

8. The chimeric protein of Claim 7, wherein the antigenic 
Borrelia polypeptides are from the same protein. 

9. The chimeric protein of Claim 6, wherein the antigenic
Borrelia polypeptides are from different proteins.

10. A chimeric protein comprising two antigenic Borrelia
polypeptides flanking a tryptophan residue, wherein
the amino-proximal polypeptide consists of a
polypeptide that is proximal from the single
tryptophan residue of a first outer surface protein of
Borrelia, and the carboxy-proximal polypeptide
consists of a polypeptide that is distal from the
single tryptophan residue of a second outer surface
protein of Borrelia.

11. The chimeric protein of Claim 10, wherein the first
and second outer surface proteins are from the same
species of Borrelia.

12. The chimeric protein of Claim 11, wherein the first
outer surface protein is outer surface protein A and
the second outer surface protein is outer surface
protein B.

13. The chimeric protein of Claim 11, wherein the first
outer surface protein is outer surface protein B, and
the second outer surface protein is outer surface
protein A.

14. The chimeric protein of Claim 10, wherein the first
and second outer surface proteins are from different
species of Borrelia.

15. The chimeric protein of Claim 14, wherein the first 
outer surface protein is outer surface protein A and 
the second outer surface protein is outer surface 
protein B. 

16. The chimeric protein of Claim 14, wherein the first
outer surface protein is outer surface protein B, and
the second outer surface protein is outer surface
protein A.

17. The chimeric protein of Claim 14, wherein the first
and second outer surface proteins are corresponding
proteins selected from the group consisting of: outer
surface protein A and outer surface protein B.

18. The chimeric protein of Claim 10, wherein the first
outer surface protein is outer surface protein A and
the second outer surface protein is outer surface
protein B.

19. The chimeric protein of Claim 18, wherein the
amino-proximal polypeptide further comprises a first,
second, and third hypervariable domain, the first
hypervariable domain consisting of residues 120
through 140 of outer surface protein A, the second
hypervariable domain consisting of residues 150
through 180 of outer surface protein A, and the third
hypervariable domain consisting of residues 200
through 217 of outer surface protein A.

20. The chimeric protein of Claim 19, wherein the first 
and second hypervariable domains are derived from 
outer surface protein A from different species of 
Borrelia. 

21. The chimeric protein of Claim 10, further comprising
an antigenic Borrelia polypeptide derived from a
Borrelia protein selected from the group consisting
of: outer surface protein A, outer surface protein B,
outer surface protein C, outer surface protein D, p12,
p39, p41, p66, and p93.

22. A nucleic acid sequence encoding a chimeric protein 
comprising two antigenic Borrelia polypeptides, 
wherein the two antigenic Borrelia polypeptides which 
comprise the chimeric protein do not occur naturally 
in the same protein in Borrelia. 

23. The nucleic acid sequence of Claim 22, wherein the 
antigenic Borrelia polypeptides are from two or more 
different species of Borrelia. 

24. The nucleic acid sequence of Claim 23, wherein the
antigenic Borrelia polypeptides are derived from
Borrelia proteins selected from the group consisting
of: outer surface protein A, outer surface protein B,
outer surface protein C, outer surface protein D, p12,
p39, p41, p66, and p93.

25. The nucleic acid sequence of Claim 24, wherein the
antigenic Borrelia polypeptides are from corresponding
proteins from two or more different species of
Borrelia.

26. The nucleic acid sequence of Claim 24, wherein two or
more of the antigenic Borrelia polypeptides are from
non-corresponding proteins from different species of
Borrelia.

27. The nucleic acid sequence of Claim 22, wherein two or 
more antigenic Borrelia polypeptides are from the same 
species of Borrelia. 

28. The nucleic acid sequence of Claim 27, wherein the
antigenic Borrelia polypeptides are derived from
Borrelia proteins selected from the group consisting
of: outer surface protein A, outer surface protein B,
outer surface protein C, outer surface protein D, p12,
p39, p41, p66, and p93.

29. The nucleic acid sequence of Claim 28, wherein the 
antigenic Borrelia polypeptides are from the same 
protein. 

30. The nucleic acid sequence of Claim 27, wherein the
antigenic Borrelia polypeptides are from different
proteins.

31. A nucleic acid sequence encoding a chimeric protein
comprising two antigenic Borrelia polypeptides
flanking a tryptophan residue, wherein the
amino-proximal polypeptide consists of a polypeptide that is
proximal from the single tryptophan residue of a first
outer surface protein of Borrelia, and the
carboxy-proximal polypeptide consists of a polypeptide that is
distal from the single tryptophan residue of a second
outer surface protein of Borrelia.

32. The nucleic acid sequence of Claim 31, wherein the
first and second outer surface proteins are from the
same species of Borrelia.

33. The nucleic acid sequence of Claim 32, wherein the
first outer surface protein is outer surface protein A
and the second outer surface protein is outer surface
protein B.

34. The nucleic acid sequence of Claim 32, wherein the
first outer surface protein is outer surface protein
B, and the second outer surface protein is outer
surface protein A.

35. The nucleic acid sequence of Claim 31, wherein the
first and second outer surface proteins are from
different species of Borrelia.

36. The nucleic acid sequence of Claim 35, wherein the
first outer surface protein is outer surface protein A
and the second outer surface protein is outer surface
protein B.

37. The nucleic acid sequence of Claim 35, wherein the
first outer surface protein is outer surface protein
B, and the second outer surface protein is outer
surface protein A.

38. The nucleic acid sequence of Claim 35, wherein the
first and second outer surface proteins are
corresponding proteins selected from the group
consisting of: outer surface protein A and outer
surface protein B.

39. The nucleic acid sequence of Claim 31, wherein the
first outer surface protein is outer surface protein A
and the second outer surface protein is outer surface
protein B.

40. The nucleic acid sequence of Claim 39, wherein the
amino-proximal polypeptide further comprises a first
and a second hypervariable domain, the first
hypervariable domain consisting of amino acid residues
1 through 140 of outer surface protein A, and the
second hypervariable domain consisting of amino acid
residues 150 through 217 of outer surface protein A.

41. The nucleic acid sequence of Claim 40, wherein the
first and second hypervariable domains are derived
from outer surface protein A from different species of
Borrelia.

42. The nucleic acid sequence of Claim 31, further
comprising an antigenic Borrelia polypeptide derived
from a Borrelia protein selected from the group
consisting of: outer surface protein A, outer surface
protein B, outer surface protein C, outer surface
protein D, p12, p39, p41, p66, and p93.

43. A nucleic acid sequence having a sequence selected
from the group consisting of: SEQ ID NO. 85, SEQ ID
NO. 88, SEQ ID NO. 91, SEQ ID NO. 94, SEQ ID NO. 97,
SEQ ID NO. 100, SEQ ID NO. 103, SEQ ID NO. 106, SEQ ID
NO. 109, SEQ ID NO. 111, SEQ ID NO. 113, SEQ ID NO.
115, SEQ ID NO. 117, SEQ ID NO. 119, SEQ ID NO. 121,
SEQ ID NO. 137, SEQ ID NO. 139, SEQ ID NO. 141, and
SEQ ID NO. 143.

44. A protein having an amino acid sequence selected from
the group consisting of: SEQ ID NO. 86, SEQ ID NO.
89, SEQ ID NO. 92, SEQ ID NO. 95, SEQ ID NO. 98, SEQ
ID NO. 101, SEQ ID NO. 104, SEQ ID NO. 107, SEQ ID NO.
110, SEQ ID NO. 112, SEQ ID NO. 114, SEQ ID NO. 116,
SEQ ID NO. 118, SEQ ID NO. 120, SEQ ID NO. 122, SEQ ID
NO. 138, SEQ ID NO. 140, SEQ ID NO. 142, and SEQ ID
NO. 144.

45. A chimeric protein according to any one of claims 1 to
21 and 44 for use in therapy or diagnosis, for example
as a vaccine against Borrelia infection, in
immunodiagnostic assays to detect the presence of
antibodies to Borrelia or to measure T-cell
reactivity.

46. A chimeric protein according to claim 45, wherein the
immunodiagnostic assay is a dot blot, Western blot,
ELISA or agglutination assay.

47. Use of the chimeric protein according to any one of
claims 1 to 21 and 44, or the nucleic acid sequence of
any one of claims 22 to 43, for the manufacture of a
compound for use in therapy or diagnosis, for example
as a vaccine against Borrelia infection, in
immunodiagnostic assays to detect the presence of
antibodies to Borrelia or to measure T-cell
reactivity.

48. Use according to claim 47, wherein the
immunodiagnostic assay is a dot blot, Western blot,
ELISA or agglutination assay.
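
The tryptophan-junction arrangement recited in Claims 10 and 31 can be illustrated with a minimal sketch. It assumes one-letter amino acid strings in which tryptophan (W) occurs exactly once; the function name and the toy sequences are hypothetical and are not sequences from this document. For simplicity the sketch takes the entire flank on each side of the tryptophan, although the polypeptides recited in the claims need not span the full flank.

```python
def trp_junction_chimera(first_osp: str, second_osp: str) -> str:
    """Join the part of first_osp before its single Trp to the part of
    second_osp after its single Trp, keeping one Trp at the junction."""
    for label, seq in (("first", first_osp), ("second", second_osp)):
        if seq.count("W") != 1:
            raise ValueError(f"{label} sequence must contain exactly one Trp (W)")
    amino_proximal = first_osp[: first_osp.index("W")]
    carboxy_proximal = second_osp[second_osp.index("W") + 1 :]
    return amino_proximal + "W" + carboxy_proximal


if __name__ == "__main__":
    # Toy sequences, hypothetical, for illustration only.
    print(trp_junction_chimera("MKAWGT", "MRLLWEDS"))
```

Checking that each input contains exactly one tryptophan mirrors the "single tryptophan residue" language of the claims and rejects inputs for which the junction would be ambiguous.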
ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA 48 
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala 
1 5 10 15 

TGT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA GTA 96 
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val 
20 25 30 

GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA AAC AAA 144 
Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asn Lys 
35 40 45 

GAC GGC AAG TAC GAT CTA ATT GCA ACA GTA GAC AAG CTT GAG CTT AAA 192 
Asp Gly Lys Tyr Asp Leu Ile Ala Thr Val Asp Lys Leu Glu Leu Lys 
50 55 60 

GGA ACT TCT GAT AAA AAC AAT GGA TCT GGA GTA CTT GAA GGC GTA AAA 240 
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Val Leu Glu Gly Val Lys 
65 70 75 80 

GCT GAC AAA AGT AAA GTA AAA TTA ACA ATT TCT GAC GAT CTA GGT CAA 288 
Ala Asp Lys Ser Lys Val Lys Leu Thr Ile Ser Asp Asp Leu Gly Gln 
85 90 95 

ACC ACA CTT GAA GTT TTC AAA GAA GAT GGC AAA ACA CTA GTA TCA AAA 336 
Thr Thr Leu Glu Val Phe Lys Glu Asp Gly Lys Thr Leu Val Ser Lys 
100 105 110 

AAA GTA ACT TCC AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAT GAA 384 
Lys Val Thr Ser Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu 
115 120 125 

AAA GGT GAA GTA TCT GAA AAA ATA ATA ACA AGA GCA GAC GGA ACC AGA 432 
Lys Gly Glu Val Ser Glu Lys Ile Ile Thr Arg Ala Asp Gly Thr Arg 
130 135 140 

CTT GAA TAC ACA GGA ATT AAA AGC GAT GGA TCT GGA AAA GCT AAA GAG 480 
Leu Glu Tyr Thr Gly Ile Lys Ser Asp Gly Ser Gly Lys Ala Lys Glu 
145 150 155 160 

GTT TTA AAA GGC TAT GTT CTT GAA GGA ACT CTA ACT GCT GAA AAA ACA 528 
Val Leu Lys Gly Tyr Val Leu Glu Gly Thr Leu Thr Ala Glu Lys Thr 
165 170 175 

ACA TTG GTG GTT AAA GAA GGA ACT GTT ACT TTA AGC AAA AAT ATT TCA 576 
Thr Leu Val Val Lys Glu Gly Thr Val Thr Leu Ser Lys Asn Ile Ser 
180 185 190 

AAA TCT GGG GAA GTT TCA GTT GAA CTT AAT GAC ACT GAC AGT AGT GCT 624 
Lys Ser Gly Glu Val Ser Val Glu Leu Asn Asp Thr Asp Ser Ser Ala 
195 200 205 

GCT ACT AAA AAA ACT GCA GCT TGG AAT TCA GGC ACT TCA ACT TTA ACA 672 
Ala Thr Lys Lys Thr Ala Ala Trp Asn Ser Gly Thr Ser Thr Leu Thr 
210 215 220 

ATT ACT GTA AAC AGT AAA AAA ACT AAA GAC CTT GTG TTT ACA AAA GAA 720 
Ile Thr Val Asn Ser Lys Lys Thr Lys Asp Leu Val Phe Thr Lys Glu 
225 230 235 240 

AAC ACA ATT ACA GTA CAA CAA TAC GAC TCA AAT GGC ACC AAA TTA GAG 768 
Asn Thr Ile Thr Val Gln Gln Tyr Asp Ser Asn Gly Thr Lys Leu Glu 
245 250 255 

GGG TCA GCA GTT GAA ATT ACA AAA CTT GAT GAA ATT AAA AAC GCT TTA 816 
Gly Ser Ala Val Glu Ile Thr Lys Leu Asp Glu Ile Lys Asn Ala Leu 
260 265 270 

AAA TA 822 
Lys 
ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA 
TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT CGT 
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala> 

50 60 70 80 90 
TGT AAG CAA AAT GTT AGC AGC CTT GAT GAA AAA AAT AGC GTT TCA GTA 
ACA TTC GTT TTA CAA TCG TCG GAA CTA CTT TTT TTA TCG CAA AGT CAT 
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val> 

100 110 120 130 140 
GAT TTA CCT GGT GGA ATG ACA GTT CTT GTA AGT AAA GAA AAA GAC AAA 
CTA AAT GGA CCA CCT TAC TGT CAA GAA CAT TCA TTT CTT TTT CTG TTT 
Asp Leu Pro Gly Gly Met Thr Val Leu Val Ser Lys Glu Lys Asp Lys> 

150 160 170 180 190 
GAC GGT AAA TAC AGT CTA GAG GCA ACA GTA GAC AAG CTT GAG CTT AAA 
CTG CCA TTT ATG TCA GAT CTC CGT TGT CAT CTG TTC GAA CTC GAA TTT 
Asp Gly Lys Tyr Ser Leu Glu Ala Thr Val Asp Lys Leu Glu Leu Lys> 

200 210 220 230 240 
GGA ACT TCT GAT AAA AAC AAC GGT TCT GGA ACA CTT GAA GGT GAA AAA 
CCT TGA AGA CTA TTT TTG TTG CCA AGA CCT TGT GAA CTT CCA CTT TTT 
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Thr Leu Glu Gly Glu Lys> 

250 260 270 280 
ACT GAC AAA AGT AAA GTA AAA TTA ACA ATT GCT GAT GAC CTA AGT CAA 
TGA CTG TTT TCA TTT CAT TTT AAT TGT TAA CGA CTA CTG GAT TCA GTT 
Thr Asp Lys Ser Lys Val Lys Leu Thr Ile Ala Asp Asp Leu Ser Gln> 

290 300 310 320 330 
ACT AAA TTT GAA ATT TTC AAA GAA GAT GCC AAA ACA TTA GTA TCA AAA 
TGA TTT AAA CTT TAA AAG TTT CTT CTA CGG TTT TGT AAT CAT AGT TTT 
Thr Lys Phe Glu Ile Phe Lys Glu Asp Ala Lys Thr Leu Val Ser Lys> 

340 350 360 370 380 
AAA GTA ACC CTT AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAC GAA 
TTT CAT TGG GAA TTT CTG TTC AGT AGT TGT CTT CTT TTT AAG TTG CTT 
Lys Val Thr Leu Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu> 
OspA-K48 

770 780 790 800 810 
GAA GGC AAA GCA GTC GAA ATT ACA ACA CTT AAA GAA CTT AAA AAC GCT 
CTT CCG TTT CGT CAG CTT TAA TGT TGT GAA TTT CTT GAA TTT TTG CGA 
Glu Gly Lys Ala Val Glu Ile Thr Thr Leu Lys Glu Leu Lys Asn Ala> 

820 
OspA-PGau 

10 20 30 40 
ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA 
TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT CGT 
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala> 

50 60 70 80 90 
TGC AAG CAA AAT GTT AGC AGC CTT GAT GAA AAA AAC AGC GCT TCA GTA 
ACG TTC GTT TTA CAA TCG TCG GAA CTA CTT TTT TTG TCG CGA AGT CAT 
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Ala Ser Val> 

100 110 120 130 140 
GAT TTG CCT GGT GAG ATG AAA GTT CTT GTA AGT AAA GAA AAA GAC AAA 
CTA AAC GGA CCA CTC TAC TTT CAA GAA CAT TCA TTT CTT TTT CTG TTT 
Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asp Lys> 

150 160 170 180 190 
GAC GGT AAG TAC AGT CTA AAG GCA ACA GTA GAC AAG ATT GAG CTA AAA 
CTG CCA TTC ATG TCA GAT TTC CGT TGT CAT CTG TTC TAA CTC GAT TTT 
Asp Gly Lys Tyr Ser Leu Lys Ala Thr Val Asp Lys Ile Glu Leu Lys> 

200 210 220 230 240 
GGA ACT TCT GAT AAA GAC AAT GGT TCT GGA GTG CTT GAA GGT ACA AAA 
CCT TGA AGA CTA TTT CTG TTA CCA AGA CCT CAC GAA CTT CCA TGT TTT 
Gly Thr Ser Asp Lys Asp Asn Gly Ser Gly Val Leu Glu Gly Thr Lys> 

250 260 270 280 
GAT GAC AAA AGT AAA GCA AAA TTA ACA ATT GCT GAC GAT CTA AGT AAA 
CTA CTG TTT TCA TTT CGT TTT AAT TGT TAA CGA CTG CTA GAT TCA TTT 
Asp Asp Lys Ser Lys Ala Lys Leu Thr Ile Ala Asp Asp Leu Ser Lys> 

290 300 310 320 330 
ACC ACA TTC GAA CTT TTA AAA GAA GAT GGC AAA ACA TTA GTG TCA AGA 
TGG TGT AAG CTT GAA AAT TTT CTT CTA CCG TTT TGT AAT CAC AGT TCT 
Thr Thr Phe Glu Leu Leu Lys Glu Asp Gly Lys Thr Leu Val Ser Arg> 

340 350 360 370 380 
AAA GTA AGT TCT AGA GAC AAA ACA TCA ACA GAT GAA ATG TTC AAT GAA 
TTT CAT TCA AGA TCT CTG TTT TGT AGT TGT CTA CTT TAC AAG TTA CTT 
Lys Val Ser Ser Arg Asp Lys Thr Ser Thr Asp Glu Met Phe Asn Glu> 
ATT AGT GTT AAC AGC AAA AAA ACT ACA CAA CTT GTG TTT ACT AAA CAA 
TAA TCA CAA TTG TCG TTT TTT TGA TGT GTT GAA CAC AAA TGA TTT GTT 
Ile Ser Val Asn Ser Lys Lys Thr Thr Gln Leu Val Phe Thr Lys Gln> 

730 740 750 760 
TAC ACA ATA ACT GTA AAA CAA TAC GAC TCC GCA GGT ACC AAT TTA GAA 
ATG TGT TAT TGA CAT TTT GTT ATG CTG AGG CGT CCA TGG TTA AAT CTT 
Tyr Thr Ile Thr Val Lys Gln Tyr Asp Ser Ala Gly Thr Asn Leu Glu> 
ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCT TTA ATA GCA 48 
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala 
1 5 10 15 

TGT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA GTA 96 
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val 
20 25 30 

GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA GAC AAA 144 
Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asp Lys 
35 40 45 

GAC GGC AAG TAC AGT CTA ATG GCA ACA GTA GAC AAG CTT GAG CTT AAA 192 
Asp Gly Lys Tyr Ser Leu Met Ala Thr Val Asp Lys Leu Glu Leu Lys 
50 55 60 
GGA ACA TCT GAT AAA AAC AAT GGA TCT GGG GTG CTT GAA GGC GTA AAA 240 
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Val Leu Glu Gly Val Lys 
65 70 75 80 

GCT GAC AAA AGC AAA GTA AAA TTA ACA GTT TCT GAC GAT CTA AGC ACA 288 
Ala Asp Lys Ser Lys Val Lys Leu Thr Val Ser Asp Asp Leu Ser Thr 
85 90 

ACC ACA CTT GAA GTT TTA AAA GAA GAT GGC AAA ACA TTA GTG TCA AAA 336 
Thr Thr Leu Glu Val Leu Lys Glu Asp Gly Lys Thr Leu Val Ser Lys 
100 105 110 

AAA AGA ACT TCT AAA GAT AAG TCA TCA ACA GAA GAA AAG TTC AAT GAA 384 
Lys Arg Thr Ser Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu 
115 120 125 

AAA GGC GAA TTA GTT GAA AAA ATA ATG GCA AGA GCA AAC GGA ACC ATA 432 
Lys Gly Glu Leu Val Glu Lys Ile Met Ala Arg Ala Asn Gly Thr Ile 
130 135 140 

CTT GAA TAC ACA GGA ATT AAA AGC GAT GGA TCC GGA AAA GCT AAA GAA 480 
Leu Glu Tyr Thr Gly Ile Lys Ser Asp Gly Ser Gly Lys Ala Lys Glu 
145 150 155 

ACT TTA AAA GAA TAT GTT CTT GAA GGA ACT CTA ACT GCT GAA AAA GCA 528 
Thr Leu Lys Glu Tyr Val Leu Glu Gly Thr Leu Thr Ala Glu Lys Ala 
165 170 175 

ACA TTG GTG GTT AAA GAA GGA ACT GTT ACT TTA AGT AAG CAC ATT TCA 576 
Thr Leu Val Val Lys Glu Gly Thr Val Thr Leu Ser Lys His Ile Ser 
180 185 190 

AAA TCT GGA GAA GTA ACA GCT GAA CTT AAT GAC ACT GAC AGT ACT CAA 624 
Lys Ser Gly Glu Val Thr Ala Glu Leu Asn Asp Thr Asp Ser Thr Gln 
195 200 205 

GCT ACT AAA AAA ACT GGG AAA TGG GAT GCA GGC ACT TCA ACT TTA ACA 672 
Ala Thr Lys Lys Thr Gly Lys Trp Asp Ala Gly Thr Ser Thr Leu Thr 
210 215 220 

ATT ACT GTA AAC AAC AAA AAA ACT AAA GCC CTT GTA TTT ACA AAA CAA 720 
Ile Thr Val Asn Asn Lys Lys Thr Lys Ala Leu Val Phe Thr Lys Gln 
225 230 235 240 

GAC ACA ATT ACA TCA CAA AAA TAC GAC TCA GCA GGA ACC AAC TTG GAA 768 
Asp Thr Ile Thr Ser Gln Lys Tyr Asp Ser Ala Gly Thr Asn Leu Glu 
245 250 255 

GGC ACA GCA GTC GAA ATT AAA ACA CTT GAT GAA CTT AAA AAC GCT TTA 816 
Gly Thr Ala Val Glu Ile Lys Thr Leu Asp Glu Leu Lys Asn Ala Leu 
260 265 270 
Sequence Range: 1 to 891 

10 20 30 
ATG AGA TTA TTA ATA GGA TTT GCT TTA GCG TTA GCT TTA ATA GGA TGT 
TAC TCT AAT AAT TAT CCT AAA CGA AAT CGC AAT CGA AAT TAT CCT ACA 
Met Arg Leu Leu Ile Gly Phe Ala Leu Ala Leu Ala Leu Ile Gly Cys> 

50 60 70 80 90 
GCA CAA AAA GGT GCT GAG TCA ATT GGT TCT CAA AAA GAA AAT GAT CTA 
CGT GTT TTT CCA CGA CTC AGT TAA CCA AGA GTT TTT CTT TTA CTA GAT 
Ala Gln Lys Gly Ala Glu Ser Ile Gly Ser Gln Lys Glu Asn Asp Leu> 

100 110 120 130 140 
AAC CTT GAA GAC TCT AGT AAA AAA TCA CAT CAA AAC GCT AAA CAA GAC 
TTG GAA CTT CTG AGA TCA TTT TTT AGT GTA GTT TTG CGA TTT GTT CTG 
Asn Leu Glu Asp Ser Ser Lys Lys Ser His Gln Asn Ala Lys Gln Asp> 

150 160 170 180 190 
CTT CCT GCG GTG ACA GAA GAC TCA GTG TCT TTG TTT AAT GGT AAT AAA 
GAA GGA CGC CAC TGT CTT CTG AGT CAC AGA AAC AAA TTA CCA TTA TTT 
Leu Pro Ala Val Thr Glu Asp Ser Val Ser Leu Phe Asn Gly Asn Lys> 

200 210 220 230 240 
ATT TTT GTA AGC AAA GAA AAA AAT AGC TCC GGC AAA TAT GAT TTA AGA 
TAA AAA CAT TCG TTT CTT TTT TTA TCG AGG CCG TTT ATA CTA AAT TCT 
Ile Phe Val Ser Lys Glu Lys Asn Ser Ser Gly Lys Tyr Asp Leu Arg> 

250 260 270 280 
GCA ACA ATT GAT CAG GTT GAA CTT AAA GGA ACT TCC GAT AAA AAC AAT 
CGT TGT TAA CTA GTC CAA CTT GAA TTT CCT TGA AGG CTA TTT TTG TTA 
Ala Thr Ile Asp Gln Val Glu Leu Lys Gly Thr Ser Asp Lys Asn Asn> 

290 300 310 320 330 
GGT TCT GGA ACC CTT GAA GGT TCA AAG CCT GAC AAG AGT AAA GTA AAA 
CCA AGA CCT TGG GAA CTT CCA AGT TTC GGA CTG TTC TCA TTT CAT TTT 
Gly Ser Gly Thr Leu Glu Gly Ser Lys Pro Asp Lys Ser Lys Val Lys> 

340 350 360 370 380 
TTA ACA GTT TCT GCT GAT TTA AAC ACA GTA ACC TTA GAA GCA TTT GAT 
AAT TGT CAA AGA CGA CTA AAT TTG TGT CAT TGG AAT CTT CGT AAA CTA 
Leu Thr Val Ser Ala Asp Leu Asn Thr Val Thr Leu Glu Ala Phe Asp> 

390 400 410 420 430 
GCC AGC AAC CAA AAA ATT TCA AGT AAA GTT ACT AAA AAA CAG GGG TCA 
CGG TCG TTG GTT TTT TAA AGT TCA TTT CAA TGA TTT TTT GTC CCC AGT 
Ala Ser Asn Gln Lys Ile Ser Ser Lys Val Thr Lys Lys Gln Gly Ser> 

440 450 460 470 480 
ATA ACA GAG GAA ACT CTC AAA GCT AAT AAA TTA GAC TCA AAG AAA TTA 
TAT TGT CTC CTT TGA GAG TTT CGA TTA TTT AAT CTG AGT TTC TTT AAT 
Ile Thr Glu Glu Thr Leu Lys Ala Asn Lys Leu Asp Ser Lys Lys Leu> 

490 500 510 520 
ACA AGA TCA AAC GGA ACT ACA CTT GAA TAC TCA CAA ATA ACA GAT GCT 
TGT TCT AGT TTG CCT TGA TGT GAA CTT ATG AGT GTT TAT TGT CTA CGA 
Thr Arg Ser Asn Gly Thr Thr Leu Glu Tyr Ser Gln Ile Thr Asp Ala> 

530 540 550 560 570 
GAC AAT GCT ACA AAA GCA GTA GAA ACT CTA AAA AAT AGC ATT AAG CTT 
CTG TTA CGA TGT TTT CGT CAT CTT TGA GAT TTT TTA TCG TAA TTC GAA 
Asp Asn Ala Thr Lys Ala Val Glu Thr Leu Lys Asn Ser Ile Lys Leu> 

580 590 600 610 620 
GAA GGA AGT CTT GTA GTC GGA AAA ACA ACA GTG GAA ATT AAA GAA GGT 
CTT CCT TCA GAA CAT CAG CCT TTT TGT TGT CAC CTT TAA TTT CTT CCA 
Glu Gly Ser Leu Val Val Gly Lys Thr Thr Val Glu Ile Lys Glu Gly> 

630 640 650 660 670 
ACT GTT ACT CTA AAA AGA GAA ATT GAA AAA GAT GGA AAA GTA AAA GTC 
TGA CAA TGA GAT TTT TCT CTT TAA CTT TTT CTA CCT TTT CAT TTT CAG 
Thr Val Thr Leu Lys Arg Glu Ile Glu Lys Asp Gly Lys Val Lys Val> 

680 690 700 710 720 
TTT TTG AAT GAC ACT GCA GGT TCT AAC AAA AAA ACA GGT AAA TGG GAA 
AAA AAC TTA CTG TGA CGT CCA AGA TTG TTT TTT TGT CCA TTT ACC CTT 
Phe Leu Asn Asp Thr Ala Gly Ser Asn Lys Lys Thr Gly Lys Trp Glu> 

730 740 750 760 
GAC AGT ACT AGC ACT TTA ACA ATT AGT GCT GAC AGC AAA AAA ACT AAA 
CTG TCA TGA TCG TGA AAT TGT TAA TCA CGA CTG TCG TTT TTT TGA TTT 
Asp Ser Thr Ser Thr Leu Thr Ile Ser Ala Asp Ser Lys Lys Thr Lys> 

770 780 790 800 810 
GAT TTG GTG TTC TTA ACA GAT GGT ACA ATT ACA GTA CAA CAA TAC AAC 
CTA AAC CAC AAG AAT TGT CTA CCA TGT TAA TGT CAT GTT GTT ATG TTG 
Asp Leu Val Phe Leu Thr Asp Gly Thr Ile Thr Val Gln Gln Tyr Asn> 

820 830 840 850 860 
ACA GCT GGA ACC AGC CTA GAA GGA TCA GCA AGT GAA ATT AAA AAT CTT 
TGT CGA CCT TGG TCG GAT CTT CCT AGT CGT TCA CTT TAA TTT TTA GAA 
Thr Ala Gly Thr Ser Leu Glu Gly Ser Ala Ser Glu Ile Lys Asn Leu> 

870 880 890 
TCA GAG CTT AAA AAC GCT TTA AAA TAA 
AGT CTC GAA TTT TTG CGA AAT TTT ATT 
Ser Glu Leu Lys Asn Ala Leu Lys ***> 
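
The three-row layout used in these sequence sheets (coding strand, complementary strand written base-for-base beneath it, and three-letter translation) can be cross-checked with a short sketch. This is a minimal illustration assuming the standard genetic code; the function names are hypothetical, and the example input is the final row of the 891-nucleotide sequence above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_1)}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "***",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement_row(row: str) -> str:
    """Base-by-base complement of a space-separated codon row (not reversed)."""
    return " ".join(codon.translate(COMPLEMENT) for codon in row.split())


def translate_row(row: str) -> str:
    """Three-letter translation of a space-separated codon row."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in row.split())


if __name__ == "__main__":
    top = "TCA GAG CTT AAA AAC GCT TTA AAA TAA"
    print(complement_row(top))
    print(translate_row(top))
```

A row whose second or third line disagrees with what these functions return for its first line is a likely site of transcription or scanning damage.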
OspC-B31 

Sequence Range: 1 to 633 

10 20 30 40 
ATG ??? AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT 
TAC ??? TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA 
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe> 

50 60 70 80 90 
ATA TCT TGT AAT AAT TCA GGG AAA GAT GGG AAT ACA TCT GCA AAT TCT 
TAT AGA ACA TTA TTA AGT CCC TTT CTA CCC TTA TGT AGA CGT TTA AGA 
Ile Ser Cys Asn Asn Ser Gly Lys Asp Gly Asn Thr Ser Ala Asn Ser> 

100 110 120 130 140 
GCT GAT GAG TCT GTT AAA GGG CCT AAT CTT ACA GAA ATA AGT AAA AAA 
CGA CTA CTC AGA CAA TTT CCC GGA TTA GAA TGT CTT TAT TCA TTT TTT 
Ala Asp Glu Ser Val Lys Gly Pro Asn Leu Thr Glu Ile Ser Lys Lys> 

150 160 170 180 190 
ATT ACG GAT TCT AAT GCG GTT TTA CTT GCT GTG AAA GAG GTT GAA GCG 
TAA TGC CTA AGA TTA CGC CAA AAT GAA CGA CAC TTT CTC CAA CTT CGC 
Ile Thr Asp Ser Asn Ala Val Leu Leu Ala Val Lys Glu Val Glu Ala> 

200 210 220 230 240 
TTG CTG TCA TCT ATA GAT GAA ATT GCT GCT AAA GCT ATT GGT AAA AAA 
AAC GAC AGT AGA TAT CTA CTT TAA CGA CGA TTT CGA TAA CCA TTT TTT 
Leu Leu Ser Ser Ile Asp Glu Ile Ala Ala Lys Ala Ile Gly Lys Lys> 

250 260 270 280 
ATA CAC CAA AAT AAT GGT TTG GAT ACC GAA TAT AAT CAC AAT GGA TCA 
TAT GTG GTT TTA TTA CCA AAC CTA TGG CTT ATA TTA GTG TTA CCT AGT 
Ile His Gln Asn Asn Gly Leu Asp Thr Glu Tyr Asn His Asn Gly Ser> 

290 300 310 320 330 
TTG TTA GCG GGA CGT TAT GCA ATA TCA ACC CTA ATA AAA CAA AAA TTA 
AAC AAT CGC CCT GCA ATA CGT TAT AGT TGG GAT TAT TTT GTT TTT AAT 
Leu Leu Ala Gly Arg Tyr Ala Ile Ser Thr Leu Ile Lys Gln Lys Leu> 

340 350 360 370 380 
GAT GGA TTG AAA AAT GAA GGA TTA AAG GAA AAA ATT GAT GCG GCT AAG 
CTA CCT AAC TTT TTA CTT CCT AAT TTC CTT TTT TAA CTA CGC CGA TTC 
Asp Gly Leu Lys Asn Glu Gly Leu Lys Glu Lys Ile Asp Ala Ala Lys> 

390 400 410 420 430 
AAA TGT TCT GAA ACA TTT ACT AAT AAA TTA AAA GAA AAA CAC ACA GAT 
TTT ACA AGA CTT TGT AAA TGA TTA TTT AAT TTT CTT TTT GTG TGT CTA 
Lys Cys Ser Glu Thr Phe Thr Asn Lys Leu Lys Glu Lys His Thr Asp> 

440 450 460 470 480 
CTT GGT AAA GAA GGT GTT ACT GAT GCT GAT GCA AAA GAA GCC ATT TTA 
GAA CCA TTT CTT CCA CAA TGA CTA CGA CTA CGT TTT CTT CGG TAA AAT 
Leu Gly Lys Glu Gly Val Thr Asp Ala Asp Ala Lys Glu Ala Ile Leu> 

490 500 510 520 
AAA ACA AAT GGT ACT AAA ACT AAA GGT GCT GAA GAA CTT GGA AAA TTA 
TTT TGT TTA CCA TGA TTT TGA TTT CCA CGA CTT CTT GAA CCT TTT AAT 
Lys Thr Asn Gly Thr Lys Thr Lys Gly Ala Glu Glu Leu Gly Lys Leu> 

530 540 550 560 570 
TTT GAA TCA GTA GAG GTC TTG TCA AAA GCA GCT AAA GAG ATG CTT GCT 
AAA CTT AGT CAT CTC CAG AAC AGT TTT CGT CGA TTT CTC TAC GAA CGA 
Phe Glu Ser Val Glu Val Leu Ser Lys Ala Ala Lys Glu Met Leu Ala> 

580 590 600 610 620 
AAT TCA GTT AAA GAG CTT ACA AGC CCT GTT GTG GCA GAA AGT CCA AAA 
TTA AGT CAA TTT CTC GAA TGT TCG GGA CAA CAC CGT CTT TCA GGT TTT 
Asn Ser Val Lys Glu Leu Thr Ser Pro Val Val Ala Glu Ser Pro Lys> 

630 
AAA CCT TAA 
TTT GGA ATT 
Lys Pro ***> 
Sequence Range: 1 to 630 

Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe> 

50 60 70 80 90 
ATA TCT TGT AAT AAT TCA GGT GGG GAT ACC GCA TCT ACT AAT CCT GAT 
TAT AGA ACA TTA TTA AGT CCA CCC CTA TGG CGT AGA TGA TTA GGA CTA 
Ile Ser Cys Asn Asn Ser Gly Gly Asp Thr Ala Ser Thr Asn Pro Asp> 

100 110 120 130 140 
GAG TCT GCA AAA GGA CCT AAT CTT ACA GTA ATA AGC AAA AAA ATT ACA 
CTC AGA CGT TTT CCT GGA TTA GAA TGT CAT TAT TCG TTT TTT TAA TGT 
Glu Ser Ala Lys Gly Pro Asn Leu Thr Val Ile Ser Lys Lys Ile Thr> 

150 160 170 180 190 
GAT TCT AAT GCA TTT GTA CTG GCT GTG AAA GAA GTT GAG GCT TTG ATC 
CTA AGA TTA CGT AAA CAT GAC CGA CAC TTT CTT CAA CTC CGA AAC TAG 
Asp Ser Asn Ala Phe Val Leu Ala Val Lys Glu Val Glu Ala Leu Ile> 

200 210 220 230 240 
TCA TCT ATA GAT GAA CTT GCT AAT AAA GCT ATT GGT AAA GTA ATA CAT 
AGT AGA TAT CTA CTT GAA CGA TTA TTT CGA TAA CCA TTT CAT TAT GTA 
Ser Ser Ile Asp Glu Leu Ala Asn Lys Ala Ile Gly Lys Val Ile His> 

250 260 270 280 
CAA AAT AAT GGT TTA AAT GCT AAT GCG GGT CAA AAC GGA TCA TTG TTA 
GTT TTA TTA CCA AAT TTA CGA TTA CGC CCA GTT TTG CCT AGT AAC AAT 
Gln Asn Asn Gly Leu Asn Ala Asn Ala Gly Gln Asn Gly Ser Leu Leu> 

290 300 310 320 330 
GCA GGA GCC TAT GCA ATA TCA ACC CTA ATA ACA GAA AAA TTA AGT AAA 
CGT CCT CGG ATA CGT TAT AGT TGG GAT TAT TGT CTT TTT AAT TCA TTT 
Ala Gly Ala Tyr Ala Ile Ser Thr Leu Ile Thr Glu Lys Leu Ser Lys> 

340 350 360 370 380 
TTG AAA AAT TCA GAA GAG TTA AAT AAA AAA ATT GAA GAG GCT AAG AAC 
AAC TTT TTA AGT CTT CTC AAT TTA TTT TTT TAA CTT CTC CGA TTC TTG 
Leu Lys Asn Ser Glu Glu Leu Asn Lys Lys Ile Glu Glu Ala Lys Asn> 
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WO 95/12676 PCT/US94/12352 



OspC-K48





390 400 410 420 430
CAT TCT GAA GCA TTT ACT AAT AGA CTA AAA GGT TCT CAT GCA CAA CTT
GTA AGA CTT CGT AAA TGA TTA TCT GAT TTT CCA AGA GTA CGT GTT GAA
His Ser Glu Ala Phe Thr Asn Arg Leu Lys Gly Ser His Ala Gln Leu>




440 450 460 470 480
GGA GTT GCT GCT [illegible] ACT GAT GAT CAT GCA AAA GAA GCT ATT TTA AAG
CCT CAA CGA CGA [illegible] TGA CTA CTA GTA CGT TTT CTT CGA TAA AAT TTC
Gly Val Ala Ala Ala Thr Asp Asp His Ala Lys Glu Ala Ile Leu Lys>






490 500 510 520
TCA AAT CCT ACT AAA GAT AAG GGT GCT AAA GCA CTT AAA GAC TTA TCT
AGT TTA GGA TGA TTT CTA TTC CCA CGA TTT CGT GAA TTT CTG AAT AGA
Ser Asn Pro Thr Lys Asp Lys Gly Ala Lys Ala Leu Lys Asp Leu Ser>


530 540 550 560 570
GAA TCA GTA GAA AGC TTG GCA AAA GCA GCG CAA GAA GCA TTA GCT AAT
CTT AGT CAT CTT TCG AAC CGT TTT CGT CGC GTT CTT CGT AAT CGA TTA
Glu Ser Val Glu Ser Leu Ala Lys Ala Ala Gln Glu Ala Leu Ala Asn>


580 590 600 610 620
TCA GTT AAA GAA CTT ACA AAT CCT GTT GTG GCA GAA AGT CCA AAA AAA
AGT CAA TTT CTT GAA TGT TTA GGA CAA CAC CGT CTT TCA GGT TTT TTT
Ser Val Lys Glu Leu Thr Asn Pro Val Val Ala Glu Ser Pro Lys Lys>



630 
* « 

CCT TAA 
GGA ATT 
Pro ***> 



FIGURE 13 (2 of 2) 






OspC-PKO

Sequence Range: 1 to 639 

10 20 30 40
ATG AAA AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT
TAC TTT TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe>

50 60 70 80 90
ATA TCT TGT AGT AAT TCA GGG AAA GGT GGG GAT TCT GCA TCT ACT AAT
TAT AGA ACA TCA TTA AGT CCC TTT CCA CCC CTA AGA CGT AGA TGA TTA
Ile Ser Cys Ser Asn Ser Gly Lys Gly Gly Asp Ser Ala Ser Thr Asn>

100 110 120 130 140
CCT GCT GAC GAG TCT GCG AAA GGG CCT AAT CTT ACA GAA ATA AGC AAA
GGA CGA CTG CTC AGA CGC TTT CCC GGA TTA GAA TGT CTT TAT TCG TTT
Pro Ala Asp Glu Ser Ala Lys Gly Pro Asn Leu Thr Glu Ile Ser Lys>

150 160 170 180 190
AAA ATT ACA GAT TCT AAT GCA TTT GTA CTT GCT GTT AAA GAA GTT GAG
TTT TAA TGT CTA AGA TTA CGT AAA CAT GAA CGA CAA TTT CTT CAA CTC
Lys Ile Thr Asp Ser Asn Ala Phe Val Leu Ala Val Lys Glu Val Glu>

200 210 220 230 240
ACT TTG GTT TTA TCT ATA GAT GAA CTT GCT AAG AAA GCT ATT GGT CAA
TGA AAC CAA AAT AGA TAT CTA CTT GAA CGA TTC TTT CGA TAA CCA GTT
Thr Leu Val Leu Ser Ile Asp Glu Leu Ala Lys Lys Ala Ile Gly Gln>

250 260 270 280
AAA ATA GAC AAT AAT AAT GGT TTA GCT GCT TTA AAT AAT CAG AAT GGA
TTT TAT CTG TTA TTA TTA CCA AAT CGA CGA AAT TTA TTA GTC TTA CCT
Lys Ile Asp Asn Asn Asn Gly Leu Ala Ala Leu Asn Asn Gln Asn Gly>

290 300 310 320 330
TCG TTG TTA GCA GGA GCC TAT GCA ATA TCA ACC CTA ATA ACA GAA AAA
AGC AAC AAT CGT CCT CGG ATA CGT TAT AGT TGG GAT TAT TGT CTT TTT
Ser Leu Leu Ala Gly Ala Tyr Ala Ile Ser Thr Leu Ile Thr Glu Lys>

340 350 360 370 380
TTG AGT AAA TTG AAA AAT TTA GAA GAA TTA AAG ACA GAA ATT GCA AAG
AAC TCA TTT AAC TTT TTA AAT CTT CTT AAT TTC TGT CTT TAA CGT TTC
Leu Ser Lys Leu Lys Asn Leu Glu Glu Leu Lys Thr Glu Ile Ala Lys>



FIGURE 14 (1 of 2)






OspC-PKO 

390 400 410 420 430
[illegible] [illegible] AAA TGT TCC GAA GAA TTT ACT AAT AAA CTA AAA AGT GGT CAT
[illegible] [illegible] TTT ACA AGG CTT CTT AAA TGA TTA TTT GAT TTT TCA CCA GTA
Ala Lys Lys Cys Ser Glu Glu Phe Thr Asn Lys Leu Lys Ser Gly His>

440 450 460 470 480
GCA GAT CTT GGC AAA CAG GAT GCT ACC GAT GAT CAT GCA AAA GCA GCT
CGT CTA GAA CCG TTT GTC CTA CGA TGG CTA CTA GTA CGT TTT CGT CGA
Ala Asp Leu Gly Lys Gln Asp Ala Thr Asp Asp His Ala Lys Ala Ala>

490 500 510 520
ATT TTA AAA ACA CAT GCA ACT ACC GAT AAA GGT GCT AAA GAA TTT AAA
TAA AAT TTT TGT GTA CGT TGA TGG CTA TTT CCA CGA TTT CTT AAA TTT
Ile Leu Lys Thr His Ala Thr Thr Asp Lys Gly Ala Lys Glu Phe Lys>

530 540 550 560 570
GAT TTA TTT GAA TCA GTA GAA GGT TTG TTA AAA GCA GCT CAA GTA GCA
CTA AAT AAA CTT AGT CAT CTT CCA AAC AAT TTT CGT CGA GTT CAT CGT
Asp Leu Phe Glu Ser Val Glu Gly Leu Leu Lys Ala Ala Gln Val Ala>

580 590 600 610 620
CTA ACT AAT TCA GTT AAA GAA CTT ACA AGT CCT GTT GTA GCA GAA AGT
GAT TGA TTA AGT CAA TTT CTT GAA TGT TCA GGA CAA CAT CGT CTT TCA
Leu Thr Asn Ser Val Lys Glu Leu Thr Ser Pro Val Val Ala Glu Ser>

630
CCA AAA AAA CCT TAA
GGT TTT TTT GGA ATT
Pro Lys Lys Pro ***>



FIGURE 14 (2 of 2) 






OspC-TRO
Sequence Range: 1 to 624 



10 20 30 40
ATG AAA AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT
TAC TTT TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe>


50 60 70 80 90
ATA TCT TGT AAT AAT TCA GGT GGG GAT TCT GCA TCT ACT AAT CCT GAT
TAT AGA ACA TTA TTA AGT CCA CCC CTA AGA CGT AGA TGA TTA GGA CTA
Ile Ser Cys Asn Asn Ser Gly Gly Asp Ser Ala Ser Thr Asn Pro Asp>


100 110 120 130 140
GAG TCT GCA AAA GGA CCT AAT CTT ACC GTA ATA AGC AAA AAA ATT ACA
CTC AGA CGT TTT CCT GGA TTA GAA TGG CAT TAT TCG TTT TTT TAA TGT
Glu Ser Ala Lys Gly Pro Asn Leu Thr Val Ile Ser Lys Lys Ile Thr>




150 160 170 180 190
GAT TCT AAT GCA TTT TTA CTG GCT GTG AAA GAA GTT GAG GCT TTG CTT
CTA AGA TTA CGT AAA AAT GAC CGA CAC TTT CTT CAA CTC CGA AAC GAA
Asp Ser Asn Ala Phe Leu Leu Ala Val Lys Glu Val Glu Ala Leu Leu>




200 210 220 230 240
TCA TCT ATA GAT GAA CTT TCT AAA GCT ATT GGT AAA AAA ATA AAA AAT
AGT AGA TAT CTA CTT GAA AGA TTT CGA TAA CCA TTT TTT TAT TTT TTA
Ser Ser Ile Asp Glu Leu Ser Lys Ala Ile Gly Lys Lys Ile Lys Asn>






250 260 270 280
GAT GGT ACT TTA GAT AAC GAA GCA AAT CGA AAC GAA TCA TTG ATA GCA
CTA CCA TGA AAT CTA TTG CTT CGT TTA GCT TTG CTT AGT AAC TAT CGT
Asp Gly Thr Leu Asp Asn Glu Ala Asn Arg Asn Glu Ser Leu Ile Ala>


290 300 310 320 330
GGA GCT TAT GAA ATA TCA AAA CTA ATA ACA CAA AAA TTA AGT GTA TTG
CCT CGA ATA CTT TAT AGT TTT GAT TAT TGT GTT TTT AAT TCA CAT AAC
Gly Ala Tyr Glu Ile Ser Lys Leu Ile Thr Gln Lys Leu Ser Val Leu>



340 350 360 370 380
AAT TCA GAA GAA TTA AAG AAA AAA ATT AAA GAG GCT AAG GAT TGT TCC
TTA AGT CTT CTT AAT TTC TTT TTT TAA TTT CTC CGA TTC CTA ACA AGG
Asn Ser Glu Glu Leu Lys Lys Lys Ile Lys Glu Ala Lys Asp Cys Ser>



FIGURE 15 (1 of 2) 






OspC-TRO 



390 400 410 420 430
GAA AAA TTT ACT ACT AAG CTA AAA GAT AGT CAT GCA GAG CTT GGT ATA
CTT TTT AAA TGA TGA TTC GAT TTT CTA TCA GTA CGT CTC GAA CCA TAT
Glu Lys Phe Thr Thr Lys Leu Lys Asp Ser His Ala Glu Leu Gly Ile>

440 450 460 470 480
CAA AGC GTT CAG GAT GAT AAT GCA AAA AAA GCT ATT TTA AAA ACA CAT
GTT TCG CAA GTC CTA CTA TTA CGT TTT TTT CGA TAA AAT TTT TGT GTA
Gln Ser Val Gln Asp Asp Asn Ala Lys Lys Ala Ile Leu Lys Thr His>

490 500 510 520
GGA ACT AAA GAC AAG GGT GCT AAA GAA CTT GAA GAG TTA TTT AAA TCA
CCT TGA TTT CTG TTC CCA CGA TTT CTT GAA CTT CTC AAT AAA TTT AGT
Gly Thr Lys Asp Lys Gly Ala Lys Glu Leu Glu Glu Leu Phe Lys Ser>

530 540 550 560 570
CTA GAA AGC TTG TCA AAA GCA GCG CAA GCA GCA TTA ACT AAT TCA GTT
GAT CTT TCG AAC AGT TTT CGT CGC GTT CGT CGT AAT TGA TTA AGT CAA
Leu Glu Ser Leu Ser Lys Ala Ala Gln Ala Ala Leu Thr Asn Ser Val>

580 590 600 610 620
AAA GAG CTT ACA AAT CCT GTT GTG GCA GAA AGT CCA AAA AAA CCT TAA
TTT CTC GAA TGT TTA GGA CAA CAC CGT CTT TCA GGT TTT TTT GGA ATT
Lys Glu Leu Thr Asn Pro Val Val Ala Glu Ser Pro Lys Lys Pro ***>



FIGURE 15 (2 of 2) 






P93 

Sequence Range: 1 to 2102 

10 20 30 40
ATG AAA AAA ATG TTA CTA ATC TTT AGT TTT TTT CTT ATT TTC TTG AAT
TAC TTT TTT TAC AAT GAT TAG AAA TCA AAA AAA GAA TAA AAG AAC TTA
Met Lys Lys Met Leu Leu Ile Phe Ser Phe Phe Leu Ile Phe Leu Asn>

50 60 70 80 90
GGA TTT CCT GTT AGT GCA AGA GAA GTT GAT AGG GAA AAA TTA AAG GAC
CCT AAA GGA CAA TCA CGT TCT CTT CAA CTA TCC CTT TTT AAT TTC CTG
Gly Phe Pro Val Ser Ala Arg Glu Val Asp Arg Glu Lys Leu Lys Asp>

100 110 120 130 140
TTT GTT AAT ATG GAT CTT GAG TTT GTA AAT TAT AAA GGC CCT TAT GAT
AAA CAA TTA TAC CTA GAA CTC AAA CAT TTA ATA TTT CCG GGA ATA CTA
Phe Val Asn Met Asp Leu Glu Phe Val Asn Tyr Lys Gly Pro Tyr Asp>

150 160 170 180 190
TCT ACA AAT ACA TAT GAA CAA ATA GTG GGT ATT GGG GAG TTT TTA GCA
AGA TGT TTA TGT ATA CTT GTT TAT CAC CCA TAA CCC CTC AAA AAT CGT
Ser Thr Asn Thr Tyr Glu Gln Ile Val Gly Ile Gly Glu Phe Leu Ala>

200 210 220 230 240
AGA CCG TTG ACC AAT TCC AAT AGC AAC TCA AGT TAT TAT GGT AAA TAT
TCT GGC AAC TGG TTA AGG TTA TCG TTG AGT TCA ATA ATA CCA TTT ATA
Arg Pro Leu Thr Asn Ser Asn Ser Asn Ser Ser Tyr Tyr Gly Lys Tyr>

250 260 270 280
TTT ATT AAT AGA TTT ATT GAT GAT CAA GAT AAA AAA GCA AGC GTT GAT
AAA TAA TTA TCT AAA TAA CTA CTA GTT CTA TTT TTT CGT TCG CAA CTA
Phe Ile Asn Arg Phe Ile Asp Asp Gln Asp Lys Lys Ala Ser Val Asp>

290 300 310 320 330
GTT TTT TCT ATT GGT AGT AAG TCA GAG CTT GAC AGT ATA TTG AAT TTA
CAA AAA AGA TAA CCA TCA TTC AGT CTC GAA CTG TCA TAT AAC TTA AAT
Val Phe Ser Ile Gly Ser Lys Ser Glu Leu Asp Ser Ile Leu Asn Leu>

340 350 360 370 380
AGA AGA ATT CTT ACA GGG TAT TTA ATA AAG TCT TTC GAT TAT GAC AGG
TCT TCT TAA GAA TGT CCC ATA AAT TAT TTC AGA AAG CTA ATA CTG TCC
Arg Arg Ile Leu Thr Gly Tyr Leu Ile Lys Ser Phe Asp Tyr Asp Arg>



FIGURE 16 (1 of 5)








390
TCT AGT GCA GAA
AGA TCA CGT CTT
Ser Ser Ala Glu

440
TAT AGA GGA GAT
ATA TCT CCT CTA
Tyr Arg Gly Asp

490
TTA AAG TCT TTA
AAT TTC AGA AAT
Leu Lys Ser Leu

530 540
CAG TGG GCT GGA
GTC ACC CGA CCT
Gln Trp Ala Gly

580
TTG TCT GGA AAT
AAC AGA CCT TTA
Leu Ser Gly Asn

630
GAT AAG GTG GTG
CTA TTC CAC CAC
Asp Lys Val Val

680
TTT GCA AGA GAT
AAA CGT TCT CTA
Phe Ala Arg Asp

730
CAA GAT AAA ATT
GTT CTA TTT TAA
Gln Asp Lys Ile

770 780
AAT ATA ACA GAA
TTA TAT TGT CTT
Asn Ile Thr Glu



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1 ATGAAAAAAT TGTTACTAAT CTTTAGTTTT TTTCTTATTT CTTTGAATGG ATTTCCTCTT 
61 AATTCAAGGG AAGTTGATAA GGAAAAATTA AAGGATTTTG TTAATATGGA TCTTGAGTTT 
121 GTAAACTATA AAGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGT 
181 GAGTTTTTAG CAAGACCATT GATTAATTCC AATAGCAACT CAATTTATTA TGGTAAATAT
241 TTTATTAATA GATTTATTGA TGATCAAGAT AAAAAAGCAA GCGTTGATGT TTTTTCTATT 
301 GGTAGTAGGT CACAGCTTGA CAGTATATTG AATCTAAGAA GAATTCTTAC AGGGTATTTG 
361 ATAAAGTCTT TTGATTATGA AAGATCTAGT GCTGAATTAA TTGCTAAGGT TATTACAATA 
421 CATAATGCTG TTTATAGAGG GGATTTAAAT TATTATAAAG AGGTTTATAT TGAGGCTGCT 
481 TTAAAGTCTT TAACTAAAGA AAATGCAGGT CTTTCTAGAG TGTACAGTCA ATGGGCTGGA 
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAAAGT TGAGTCTGAC 
601 ATTGATATTG ACAGTTTGGT TACAGATAAG GTTGTGGCAG GTGTTTTAAG CGAGAATGAA
661 GCAGGTGTTA ACTTTGCAAG AGATATTACA GATATTCAAG GCGAAACTCA TAAAGCAGAT 
721 CAAGATAAAA TTGATATTGA ATTAGATAAT GTTCATAAAA GTGATTCCAA TATAACAGAG 
781 ACTATTGAGA ATTTAAGAGA TCAGCTTGAA AAGGCTACAG ATGAAGAGCA TAGAAAAGAG 
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAAGAAG AACTAGATAA AAAGGCAATC 
901 GATCTTGATA AAGCCCAACA AAAATTAGAT TCTTCTGAAG ATAATTTAGA TATTCAAAGG 
961 GATACTGTTA GAGAGAAGAT TCAAGAGGAT ATTGACGAGA TTAATAAAGA AAAGAATTTG 
1021 CCAAAACCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AGCTACAAAT AAAAGAGAGT 
1081 CTAGAAGACT TGCAGGAACA GCTTAAAGAA ACTAGCGATG AAAATCAAAA AAGAGAAATT 
1141 GAAAAGCAAA TTGAAATCAA AAAAAGTGAT GAAGAACTTT TAAAAAGTAA AGATCCTAAA 
1201 GCATTAGATC TTAATGGAGA TTTAAATTCT AAAGTTTCTA GTAAAGAAAA AATTAAAGGC 
1261 AAAGAAGGAG AAATAGTCAA AGAGGAATCA AAGGCAAGTT TAGCTGATTT GAATAATGAC 
1321 GAAAATCTTA TGAGGCCGGA AGATCAAAAA TTATCTGAGG ATAAAAAATT AGATAGTAAA 
1381 AAAAATTTAA AACCTGTTTC TGAGATTGAG AGAGTAAATG AAATTTCGAA GTCTAACAAC 
1441 AATGAGATTA GTGAATCATC ACCATTATAT AAGCCTTCTT ATAGCGATAT GGATTCAAAA 
1501 GAGGGTATAG ATAATAAAGA TGTTAACTTG CAAGAAACCA AGTCTCAAAC TAAAAGTCAA 
1561 CCTACTTCTT TAAATCAAGA TTTGACTACT ATGTCTATAG ATTCTAGTAA TCCTGTATTT 
1621 TTAGAGGTTA TTGATCCTAT TACAAATTTA GGAACGCTTC AACTTATTGA TTTGAATACC 
1681 GGTGTTAGAC TTAAAGAAAG TACTCAGCAA GGCATTCAGC GGTATGGAAT TTATGAACGT 
1741 GAAAAAGATT TAGTTGTTAT TAAAATGGAT TCAGGAAAAG CCAAGCTTCA AATACTTAAT 
1801 AAACTTGAGA ATTTAAAAGT GATATCGGAG TCTAATTTTG AGATTAATAA AAATTCATCT
1861 CTTTATGTTG ACTCTAAAAT GATTTTAGTA GTTGTGAGAG ATAGTGGTAA TGTTTGGAGA
1921 TTGGCTAAAT TTTCTCCTAA AAATTTAAAT GAGTTTATTC TTTCAGAGAA TAAAATTTTG 
1981 CCTTTTACTA GCTTTTCTGT GAGAAAGAAT TTTATTTATT TGCAGGATGA GTTTAAAAGT
2041 CTTATTACTT TAGATGTAAA TACTTTAAAA AAAGTTAAGT A 
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTGTTT TTTTAAATGG ATTTCCTCTT
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAATTT
121 GTTAATTACA AGGGTCCTTA TGATTCTACA GATACATATG AACAAATAGT AGGTATTGGG
181 GAGTTTTTAG CAAGGCCGTT GAACAATTCC AATAGTAATT CAAGTTATTA TGGTAAATAT
241 TTTGTTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GTGTTGATAT TTTTTCTATT
301 GGTAGTAAGT CAGAGCTTGA TAGTATATTA AATCTAAGAA GAATTCTTAC AGGGTATTTA
361 ATGAAGTCTT TTGATTATGA GAGGTCTAGT GCGGAATTAA TTGCTAAAGC TATTACAATA
421 TATAATGCTG TTTATAGAGG AGATTTAGAT TATTACAAAG AGTTTTATAT TGAGGCTTCT
481 TTGAAGTCTT TGACTAAAGA AAATGCAGGT CTTTCTAGGG TGTACAGTCA ATGGGCTGGG
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAATGT TGAGTCTGAC
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTGGTGGCAG CTCTTTTAAG TGAGAATGAA
661 TCAGGTGTTA ACTTTGCAAG AGATATTACA GACATTCAAG GCGAAACTCA TAAAGCAGAT
721 CAAGATAAAA TTGATATTGA ATTAGATAAT TTTCATGAAA GTGATTCCAA TATAACAGAA
781 ACTATTGAGA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAGGAAG AATTAGATAA AAAGGCAATT
901 GATCTTGATA AAGCTCAACA AAAATTAGAT TTTGCTGAAG ATAATCTAGA TATTCAAAGG
961 GATACTGTTA GAGAGAAGCT TCAAGAAAAT ATTAACGAGA CTAATAAGGA AAAGAATTTA
1021 CCAAAGCCTG GTGATGTAAG TTCTCCTAAG GTTGATAAGC AGTTGCAGAT AAAAGAGAGT
1081 CTAGAAGATT TGCAAGAGCA GCTTAAAGAA GCTAGTGATG AAAATCAAAA AAGAGAAATA
1141 GAAAAGCAAA TTGAAATCAA AAAAAATGAT GAAGAACTTT TTAAAAATAA AGATCATAAA
1201 GCATTAGATC TTAAGCAAGA ATTAAATTCT AAAGCTTCTA GTAAAGAAAA AATTGAAGGC
1261 GAAGAAGAGG ATAAAGAATT AGATAGTAAA AAAAATTTAG AGCCTGTTTC TGAGGCTGAT
1321 AAAGTAGATA AAATTTCCAA GTCTAACAAC AATGAGGTTA GTAAATTATC CCCGTTAGAT
1381 GAGCCTTCTT ATAGCGACAT TGATTCGAAA GAGGGTGTAG ATAACAAAGA TGTTGATTTG
1441 CAAAAAACTA AACCCCAAGT TGAAAGTCAA CCTACTTCGT TAAATGAAGA TTTGATTGAT
1501 GTGTCTATAG ATTCCAGTAA TCCTGTCTTT TTAGAGGTTA TCGATCCGAT TACAAATTTA
1561 GGAACGCTTC AACTTATTGA TTTGAATACC GGTGTTAGAC TTAAAGAAAG TGCTCAACAA
1621 GGTATTCAGC GATATGGAAT TTATGAACGT GAAAAAGATT TGGTTGTTAT TAAAATAGAT
1681 TCAGGAAAAG CTAAGCTTCA GATACTTGAT AAACTCGAGA ATTTAAAAGT GATATCAGAG
1741 TCTAATTTTG AGATTAATAA AAATTCATCT CTTTATGTTG ACTCTAGAAT GATTTTAGTA
1801 GTTGTTAAGG ACGATAGTAA TGCTTGGAGA TTGGCTAAAT TTTCTCCTAA AAATTTAGAT
1861 GAATTTATTC TGTCAGAAAA TAAAATTTTG CCTTTTACTA GCTTTGCTGT GAGAAAGAAT
1921 TTTATTTATT TGCAAGATGA ACTTAAAAGC TTAGTTACTT TAGATGTAAA TACTTTAAAA
1981 AAAGTTAAGT A
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTATTT CTTTGAATGG ATTTCCCCTT 
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAGTTT 
121 GTAAACTATA AAGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGT 
181 GAGTTTTTAG CAAGACCATT GATTAATTTC AATAGCAACT CAAGTTATTA TGGTAAATAT 
241 TTTATTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GCGTTGATGT TTTTTCTATT 
301 AGTAGTAAGT CACAGCTTGA CAGTATATTG AATTTAAGAA GAATTCTTAC AGGGTATTTG 
361 ATAAAGTCTT TTGATTATGA AAGATCTAGT GCTGAATTAA TTGCCAAGGT TATTACAATA 
421 CATAATGCTG TTTATAGAGG TGATTTAAAT TATTATAAAG AGTTTTATAT TGAGTCTGCT 
481 TTAAAGTCTT TAACTAAAGA AAATGCAGGT CTTTCTAGAG TGTACAGTCA ATGGGCTGGA 
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAAAAT TGAGTCTGAC 
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTTGTGGGAG GTGTTTTAAG CGAAAATGAA
661 GCAGGTGTTA ACTTTGCAAG GGATATTACA GATATTCAAG GAGAAACTCA TAAAGCAGAT 
721 CAAGATAAAA TTGATATTGA ATTAGATAAT GTTCATGAAA GTGATTCCAA TATAACAGAA 
781 ACTATTGAGA ATTTAAGAGA TCAGCTTGAA AAGGCTACAG ATGAAGAGCA TAGAAAAGAG 
841 ATTGAAAGTC AAGTTGATGC TAAAAAGAAA CAAAAAGAAG AACTAGATAA AAAGGCAATC 
901 GATCTTGATA AAGCCCAACA AAAATTAGAT TTTTCTGAAG ATAATTTAGA TATTCAAAGG 
961 GATACTGTTA GAGAGAAGAT TCAAGAGGAT ATTAACGAGA TTAATAAGGA AAAGAATTTA 
1021 CCAAAACCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AGCTACAAAT AAAAGAGAGT 
1081 CTAGAAGACT TGCAGGAGCA GCTTAAAGAA ACTAGCGATG AAAATCAAAA AAGAGAAATT 
1141 GAAAAGCAAA TTGAAATCAA AAAAAGTGAT GAAGAACTTT TAAAAAGCAA AGATCCTAAA 
1201 GCATTAGATC TTAATCGAGA TTTAAATTCT AAAGCTTCTA GTAAAGAAAA AATTAAAGGC 
1261 AAAGAAAAAG AAATAGTCAA AGAGAAATCA AAGGTAAGTT TAGGTGATTT GGATAATGAC 
1321 GAAACCCTTA TGACGCCGGA AGATCAAAAA TTATCTGAGG ATAAAAAATT AGATAGTAAA 
1381 AAAAATTTAA AACCTGTTTC TGAGATTGAG AGAGTAAATG AAATTTCAAA GTCTAACAAC 
1441 AATGAGGTTA GCAAATCATC ACCATTAGAT AAGCCTTCTT ATAGTGATAT CGATTCAAAA 
1501 GAGGTTGTAG ATAATAAAGA TGTTAATTTG CAAGAAACCA AGCCTCAAGC TAAAAGTCAA 
1561 TCTACTTCTT TAAATCAAGA TTTGATTACT ATGTCTATAG ATTCTAGTAA TCCTGTATTT 
1621 TTAGAGGTTA TTGATCGTAT TACAAATTTA GGAATGCTTC AACTTATTGA TTTAAATACT 
1681 GGTGTTAGAC TTAAAGAAAG CACTCAGCAA GGCATTCAGC GTTATGGAAT TTATGAACGT 
1741 GAAAAAGATT TAGTTGTTAT TAAAATGGAT TCAGGAAAAG CTAAGCTTCA AATACTTAAT
1801 AAACTTGAGA ATTTAAAAGT GATATCAGAG TCTAATTTTG AGATTAATAA AAATTCATCT
1861 CTTTATGTTG ACTCTAAAAT GATTTTAGTA GCTGTGAAAG ATAGTGGTAA TGTTTGGAGA 
1921 TTGGCTAAAT TTTCTCCTAA AAATTTAGAT GAGTTTATTC TTTCAGAGAA TAAAATTTTG 
1981 CCTTTTACTA GCTTTTCTGT GAGAAAGAAT TTTATTTATT TGCAAGATGA GTTTAAAAGT 
2041 CTTATTACTT TAGATGTAAA TACTTTAAAA AAAGTTAAGT A 
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTGTTT TTTTAAATGG ATTTCCTCTT 
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAATTT 
121 GTTAATTACA AGGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGG 
181 GAGTTTTTAG CAAGGCCGTT GATCAATTCC AATAGTAATT CAAGTTATTA TGGTAAATAT
241 TTTGTTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GTGTTGATAT TTTTTCTATT 
301 GGTAGTAAGT CAGAGCTTGA TAGTATATTA AATCTAAGAA GAATTCTTAC AGGGTATTTA 
361 ATGAAGTCTT TTGATTATGA GAGGTCTAGT GCGGAATTAA TTGCTAAAGC TATTACAATA 
421 TATAATGCTG TTTATAGAGG AGATTTAGAT TATTACAAAG AGTTTTATAT TGAGGCTTCT 
481 TTGAAGTCTT TGACTAAAGA AAATGCAGGT CTTTCTAGGG TGTACAGTCA ATGGGCTGGG 
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAATGT TGAGTCTGAC 
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTGGTGGCAG CTCTTTTAAG TGAGAATGAA 
661 TCAGGTGTTA ACTTTGCAAG AGATATTACA GACATTCAAG GCGAAACTCA TAAAGCAGAT 
721 CAAGATAAAA TTGATATTGA ATTAGATAAT ATTCATGAAA GTGATTCCAA TATAACAGAA 
781 ACTATTGAGA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG 
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAGGAAG AATTAGATAA AAAGGCAATT 
901 GATCTTGATA AAGCTCAACA AAAATTAGAT TTTGCTGAAG ATAATCTAGA TATTCAAAGG 
961 GATACTGTTA GAGAGAAGCT TCAAGAGAAT ATTAACGAGA CTAATAAGGA AAAGAATTTA 
1021 CCAAAGCCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AACTACAAAT AAAAGAGAGC 
1081 CTGGAAGATT TGCAGGAGCA GCTTAAAGAA ACTGGTGATG AAAATCAGAA AAGAGAAATT 
1141 GAAAAGCAAA TTGAAATCAA AAAAAGTGAT GAAAAGCTTT TAAAAAGTAA AGATGATAAA 
1201 GCAAGTAAAG ATGGTAAAGC CTTGGATCTT GATCGAGAAT TAAATTCTAA AGCTTCTAGC 
1261 AAAGAAAAAA GTAAAGCCAA GGAAGAAGAA ATAACCAAGG GTAAGTCACA GAAAAGCTTA 
1321 GGCGATTTGA ATAATGATGA AAATCTTATG ATGCCAGAAG ATCAAAAATT ACCTGAGGTT 
1381 AAAAAATTAG ATAGCAAAAA AGAATTTAAA CCTGTTTCTG AGGTTGAGAA ATTAGATAAG 
1441 ATTTTCAAGT CTAATAACAA TGTTGGAGAA TTATCACCGT TAGATAAATC TTCTTATAAA 
1501 GACATTGATT CAAAAGAGGA GACAGTTAAT AAAGATGTTA ATTTGCAAAA GACTAAGCCT 
1561 CAGGTTAAAG ACCAAGTTAC TTCTTTGAAT GAAGATTTGA CTACTATGTC TATAGATTCC 
1621 AGTAGTCCTG TATTTTTAGA GGTTATTGAT CCAATTACAA ATTTAGGAAC TCTTCAACTT 
1681 ATTGATTTAA ATACTGGTGT TAGGCTTAAA GAAAGCACTC AGCAAGGCAT TCAGCGGTAT 
1741 GGAATTTATG AACGTGAAAA AGATTTGGTT GTTATTAAAA TGGATTCAGG AAAAGCTAAG 
1801 CTTCAGATAC TTGATAAACT TGAAAATTTA AAAGTGGTAT CAGAGTCTAA TTTTGAGATT 
1861 AATAAAAATT CATCTCTTTA TGTTGATTCT AAAATGATTT TAGTAGCTGT TAGGGATAAA 
1921 GATAGTAGTA ATGATTGGAG ATTGGCCAAA TTTTCTCCTA AAAATTTAGA TGAGTTTATT 
1981 CTTTCAGAGA ATAAAATTAT GCCTTTTACT AGCTTTTCTG TGAGAAAAAA TTTTATTTAT 
2041 TTGCAAGATG AGTTTAAAAG TCTAGTTATT TTAGATGTAA ATACTTTAAA AAAAGTTAAG 
2101 TAAAGCC 
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTGTTT TTTTAAATGG ATTTCCTCTT 
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAATTT 
121 GTTAATTACA AGGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGG 
181 GAGTTTTTAG CAAGGCCGTT GATCAATTCC AATAGTAATT CAAGTTATTA TGGTAAATAT 
241 TTTGTTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GTGTTGATAT TTTTTCTATT 
301 GGTAGTAAGT CAGAGCTTGA TAGTATATTA AATCTAAGAA GAATTCTTAC AGGGTATTTA 
361 ATGAAGTCTT TTGATTATGA GAGGTCTAGT GCGGAATTAA TTGCTAAAGC TATTACAATA
421 TATAATGCTG TTTATAGAGG AGATTTAGAT TATTACAAAG AGTTTTATAT TGAGGCTTCT 
481 TTGAAGTCTT TGACTAAAGA AAATGCAGGT CTTTCTAGGG TGTACAGTCA ATGGGCTGGG 
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAATGT TGAGTCTGAC 
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTGGTGGCAG CTCTTTTAAG TGAGAATGAA
661 TCAGGTGTTA ACTTTGCAAG AGATATTACA GACATTCAAG GCGAAACTCA TAAAGCAGAT 
721 CAAGATAAAA TTGATATTGA ATTAGATAAT TTTCATGAAA GTGATTCCAA TATAACAGAA 
781 ACTATTGAGA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG 
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAGGAAG AATTAGATAA AAAGGCAATT 
901 GATCTTGATA AAGCTCAACA AAAATTAGAT TTTGCTGAAG ATAATCTAGA TATTCAAAGG
961 GATACTGTTA GAGAGAAGCT TCAAGAAAAT ATTAACGAGA CTAATAAGGA AAAGAATTTA 
1021 CCAAAGCCTG GTGATGTAAG TTCTCCTAAG GTTGATAAGC AGTTGCAGAT AAAAGAGAGT 
1081 CTAGAAGATT TGCAAGAGCA GCTTAAAGAA GCTAGTGATG AAAATCAAAA AAGAGAAATA 
1141 GAAAAGCAAA TTGAAATCAA AAAAAATGAT GAAGAACTTT TTAAAAATAA AGATCATAAA 
1201 GCATTAGATC TTAAGCAAGA ATTAAATTCT AAAGCTTCTA GTAAAGAAAA AATTGAAGGC 
1261 GAAGAAGAGG ATAAAGAATT AGATAGTAAA AAAAATTTAG AGCCTGTTTC TGAGGCTGAT 
1321 AAAGTAGATA AAATTTCCAA GTCTAACAAC AATGAGGTTA GTAAATTATC CCCGTTAGAT
1381 GAGCCTTCTT ATAGCGACAT TGATTCGAAA GAGGGTGTAG ATAACAAAGA TGTTGATTTG 
1441 CAAAAAACTA AACCCCAAGT TGAAAGTCAA CCTACTTCGT TAAATGAAGA CTTGATTGAT 
1501 GTGTCTATAG ATTCCAGTAA TCCTGTCTTT TTAGAGGTTA TCGATCCGAT TACAAATTTA 
1561 GGAACGCTTC AACTTATTGA TTTGAATACC GGTGTTAGAC TTAAAGAAAG TGCTCAACAA 
1621 GGTATTCAGC GATATGGAAT TTATGAACGT GAAAAAGATT TGGTTGTTAT TAAAATAGAT 
1681 TCAGGAAAAG CTAAGCTTCA GATACTTGAT AAACTCGAGA ATTTAAAAGT GATATCAGAG
1741 TCTAATTTTG AGATTAATAA AAATTCATCT CTTTATGTTG ACTCTAGAAT GATTTTAGTA 
1801 GTTGTTAAGG ACGATAGTAA TGCTTGGAGA TTGGCTAAAT TTTCTCCTAA AAATTTAGAT
1861 GAATTTATTC TGTCAGAAAA TAAAATTTTG CCTTTTACTA GCTTTGCTGT GAGAAAGAAT 
1921 TTTATTTATT TGCAAGATGA ACTTAAAAGC TTAGTTACTT TAGATGTAAA TACTTTAAAA 
1981 AAAGTTAAGT A 



FIGURE 21 






p93 - 25015 

1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTATTT TTTTGAATGG ATTTCCTCTT 
61 AATGCAAGGA AAGTTGATAA GGAAAAATTA AAGGATTTTG TTAATATGGA TCTTGAGTTT 
121 GTAAATTATA AAGGTCCTTA TGATTCTACA AATACGTATG AACAAATAGT GGGTATTGGG 
181 GAGTTTTTAG CAAGACCGCT GACCAATTCC AATAGCAACT CAAGTTATTA TGGCAAATAT
241 TTTATTAATA GATTTATTGA TGATCAAGAT AAAAAAGCAA GTGTTGATGT TTTTTCTATA 
301 AGCAGCAAAT CAGAGCTTGA CAGTATATTG AATTTAAGAA GAATTCTTAC AGGGTATATA 
361 ATAAAGTCTT TCGATTATGA CAGGTCTAGT GCAGAATTAA TTGCTAAGGT TATTACAATA 
421 TATAATGCTG TTTATAGAGG AGATTTGGAT TATTATAAAG GGTTTTATAT TGAGCCTGCT 
481 TTGAAGTCTT TAACTAAAGA AAACGCAGGT CTTTCTAGGG TTTACAGTCA GTGGGCTGGA 
541 AAGACTCAAA TATTTATTCC TCTTAAAAAG GATATTTTGT CTGGAAATAT TGAATCTGAC
601 ATTGATATTG ACAGTTTGGT TACAGATAAG GTGATAGCAG CTCTTTTAAG CGAAAATGAA 
661 GCAGGCGTTA ACTTTGCAAG AGATATTACA GATATTCAAG GCGAAACTCA TAAGGCAGAT 
721 CAAGATAAGA TTGATACTGA ATTAGACAAT ATCCATGAAA GCGATTCTAA TATAACAGAA 
781 ACTATTGAAA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG 
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA GAAAAGGAAG AGCTAGATAA AAAGGCAATC 
901 AATCTTGATA AAGCTCAGCA AAAATTAGAC TCTGCTGAAG ATAATTTAGA TGTTCAAAGA 
961 GATACTGTTA GAGAGAAAAT TCAAGAGGAT ATTAATGAGA TTAATAAGGA AAAGAATTTG 
1021 CCAAAACCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AACTGCAAAT AAAAGAGAGT 
1081 CTAGAAGATT TGCAGGAGCA GCTTAAAGAA GCTGGTGATG AAAATCAGAA AAGAGAAATT 
1141 GAGAAGCAAA TTGAAATCAA AAAAAGGGAC GAAGAACTTT TAAAAAGTAA AGATGGCAAA 
1201 GTAAGTAAAG ATTATGAAGC ATTAGATCTT GATCGAGAAT TATCCAAAGC TTCTAGTAAA 
1261 GAAAAAAGTA AGGTCAAGGA AGAAGAAATA ACTAAAGGTA AATCACGGGC AAGCTTAGGC 
1321 GATTTGAATA ATGATAAAAA CCTTATGTTG CCAGAAGATC AAAAATTACC TGAAGATAAA
1381 AAATTGGATA GTAAATTAGA TGGTAAAAAA GAATTTAAAC CAGTTTCTGA GGTTGAAAAA 
1441 TTAGATAAGA TTTCCAAGTC TAATAACAAT GAGGTTGGCA AGTTATCACC ATTAGATAAG 
1501 CCTTCTTATG ATGATATTGA TTCAAAAGAG GAGGTAGATA ATAAAGCTAT TAATTTGCAA 
1561 AAGATCGACC CTAAAGTTAA AGACCAAACT ACTTCTTTGA ATGAAGATTT GGATAAAGAT 
1621 TTGACTACTA TGTCTATAGA TTCCAGCAGT CCTGTATTTC TAGAGGTTAT TGATCCTATT 
1681 ACAAATTTAG GAACCCTGCA GCTTATTGAT TTAAATACTG GGGTTAGGCT TAAGGAAAGC 
1741 ACTCAGCAAG GCATTCAGCG GTATGGAATT TATGAACGTG AAAAAGATTT GGTTGTTATT 
1801 AAAATGGATT CAGGAAAGGC TAAGCTTCAA ATACTTAATA AGCTTGAAAA TTTGAAAGTG 
1861 GTATCAGAGT CTAATTTTGA GATCAATAAA AATTCATCTC TTTATGTTGA CTCTAAAATG
1921 ATTTTGGCAG CTGTTAGAGA TAAGGATGAT AGCAATGCTT GGAGATTGGC TAAATTTTCT
1981 CCTAAAAATT TGGATGAGTT TATTCTTTCA GAGAATAAAA TTTTGCCTTT TACTAGCTTT
2041 TCTGTGAGAA AAAATTTTAT TTATTTGCAA GATGAGCTTA AAAATCTAGT TATTTTAGAT 
2101 GTAAATACTT TAAAAAAAGT TAAGTA 
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